M.Sc Thesis: Fritze Clemens – A Dexterous Multi-Finger Robotic Manipulator Framework for Intuitive Teleoperation and Contact-Rich Imitation Learning
Supervisors: M.Eng Fotios Lygerakis, Univ.-Prof. Dr. Elmar Rückert
Theoretical difficulty: mid
Practical difficulty: hard
Abstract
Robotic manipulation in dynamic environments requires systems that can adapt to uncertainties and learn from limited human input. This thesis presents a dexterous multi-finger robotic framework that integrates intuitive teleoperation with self-supervised visuotactile representation learning to enable contact-rich imitation learning. Central to the system is a Franka Emika Panda robotic arm paired with a multi-fingered LEAP Hand equipped with high-resolution GelSight Mini tactile sensors. A Meta Quest 3 teleoperation interface captures natural human demonstrations while collecting multimodal data, including visual, tactile, and joint-state inputs, to train the self-supervised encoders.
The study evaluates two representation learning methods, BYOL and MViTac, under low-data conditions. Extensive experiments on complex manipulation tasks, such as pick-and-place, battery insertion, and book opening, demonstrate that BYOL-trained encoders consistently outperform both MViTac and a ResNet18 baseline, achieving a 60% success rate on the challenging spiked cylinder task. Key findings highlight the importance of tactile feedback quality, with GelSight sensors delivering more robust tactile impressions than lower-resolution alternatives. Furthermore, parameter studies reveal how system settings (e.g., reject buffers, movement thresholds) and demonstration selection strongly influence task performance.
Despite challenges in scenarios requiring precise visual-tactile coordination, the results validate the potential of self-supervised learning to reduce human annotation effort and facilitate a smooth transition from teleoperated control to autonomous execution. This work provides insights into the integration of hardware and software components, as well as control strategies, and shows that BYOL is a promising approach for tactile representation learning in autonomous robotic manipulation.
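For readers unfamiliar with BYOL, the following minimal sketch illustrates the two ingredients of the method as it is generally described: a normalized regression loss between the online network's prediction and the target network's projection, and an exponential moving average (EMA) update of the target weights. It is not the thesis implementation. The function names, the plain-list representation of vectors and weights, and the value of `tau` are assumptions chosen for illustration only.

```python
import math


def l2_normalize(v, eps=1e-12):
    """Scale a vector to (approximately) unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def byol_loss(online_prediction, target_projection):
    """Per-sample BYOL loss: 2 - 2 * cosine similarity.

    In training, the target projection is treated as a constant
    (no gradient flows through the target network).
    """
    p = l2_normalize(online_prediction)
    z = l2_normalize(target_projection)
    cosine = sum(a * b for a, b in zip(p, z))
    return 2.0 - 2.0 * cosine


def ema_update(target_weights, online_weights, tau=0.99):
    """Move the target weights slowly towards the online weights.

    tau is an illustrative value, not one taken from the thesis.
    """
    return [tau * t + (1.0 - tau) * o
            for t, o in zip(target_weights, online_weights)]
```

The loss is 0 when the two embeddings point in the same direction and grows to 4 when they point in opposite directions. In a visuotactile setting, the two inputs would typically be differently augmented views of the same camera or tactile image.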
Milestones
Teleoperation test of the LEAP Hand:
Visual encoder test:
First version of the FrankaArm-control test:
Dataset collection / teleoperation of the whole setup: