M.Sc Thesis: Fritze Clemens – A Dexterous Multi-Finger Robotic Manipulator Framework for Intuitive Teleoperation and Contact-Rich Imitation Learning


Supervisors: M.Eng Fotios Lygerakis, Univ.-Prof. Dr. Elmar Rückert

Theoretical difficulty: mid
Practical difficulty: hard

 

Abstract

Robotic manipulation in dynamic environments requires systems that can adapt
to uncertainties and learn from limited human input. This thesis presents a dexterous
multi-finger robotic framework that integrates intuitive teleoperation with
self-supervised visuotactile representation learning to enable contact-rich imitation
learning. Central to the system is a Franka Emika Panda robotic arm paired with a
multi-fingered LEAP Hand equipped with high-resolution GelSight Mini tactile sensors.
A Meta Quest 3 teleoperation interface captures natural human demonstrations while
collecting multimodal data, including visual, tactile, and joint-state inputs, to train
the self-supervised encoders.

The study evaluates two representation learning methods, BYOL and MViTac, under
low-data conditions. Extensive experiments on complex manipulation tasks, such as
pick-and-place, battery insertion, and book opening, demonstrate that BYOL-trained
encoders consistently outperform both MViTac and a ResNet18 baseline, achieving
a 60% success rate on the challenging spiked cylinder task. Key findings highlight
the critical role of tactile feedback quality, with GelSight sensors delivering more
robust tactile impressions than lower-resolution alternatives. Furthermore, parameter
studies reveal how system settings (e.g., reject buffers, movement thresholds) and
demonstration selection critically influence task performance.
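
For readers unfamiliar with BYOL, the following is a minimal sketch of its two core ingredients as described in the general BYOL literature: a regression loss between the online network's prediction and the target network's projection, and an exponential moving average (EMA) update of the target weights. It is not taken from the thesis code. The function names, the plain-list vectors, and the decay value `tau` are illustrative assumptions.

```python
import math


def byol_loss(prediction, target):
    """BYOL regression loss for one pair of vectors.

    Equals 2 - 2 * cosine_similarity(prediction, target), i.e. the squared
    distance between the two L2-normalised vectors.
    """
    dot = sum(p * t for p, t in zip(prediction, target))
    norm_p = math.sqrt(sum(p * p for p in prediction))
    norm_t = math.sqrt(sum(t * t for t in target))
    return 2.0 - 2.0 * dot / (norm_p * norm_t)


def ema_update(target_params, online_params, tau=0.99):
    """Move each target weight a small step towards its online counterpart.

    tau is a hypothetical decay rate; values close to 1 make the target
    network change slowly.
    """
    return [tau * t + (1.0 - tau) * o
            for t, o in zip(target_params, online_params)]


if __name__ == "__main__":
    # Two augmented views that map to similar embeddings give a small loss.
    print(byol_loss([1.0, 0.1], [1.0, 0.0]))
    # The target weights drift slowly towards the online weights.
    print(ema_update([1.0, 1.0], [0.0, 2.0], tau=0.9))
```

In a visuotactile setting, the two views could be augmented camera and tactile images of the same contact event, but how the views are constructed in this thesis is not specified in the abstract.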

Despite challenges in scenarios requiring precise visual-tactile coordination, the
results validate the potential of self-supervised learning to reduce human annotation
effort and facilitate a smooth transition from teleoperated control to autonomous
execution. This work provides insights into the integration of hardware and
software components, as well as control strategies, and shows that BYOL is a
promising approach to tactile representation learning for autonomous robotic
manipulation.

Milestones

Teleoperation test of the LEAP Hand:

Visual encoder test:

First version of the FrankaArm-control test:

Dataset collection / teleoperation of the whole setup:

Fully autonomous task execution:
