Deep Reinforcement Learning for Navigation in Warehouses

Deep Reinforcement Learning (DRL) has demonstrated great success in learning single or multiple tasks from scratch. Various DRL algorithms have been proposed and applied to a broad class of tasks, including chess and video games, robot navigation, and robot manipulation.

In this work, we investigate the potential of DRL for a mapless navigation task in a warehouse. The challenges of the task are the partial observability of the space and the need for effective exploration strategies to learn navigation strategies quickly.

We trained a mobile robot (the agent) from scratch and compared how different sensor observations influence navigation performance. The evaluated sensor setups are a 360-degree Lidar sensor, a depth image only, and an RGB image only. For the Lidar and RGB inputs, we evaluated both partial and full observability of the state space. We successfully trained the agent to navigate to a goal with a reward setting that is also applicable in the real world.
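The exact reward setting is not spelled out here; a minimal sketch of a real-world-applicable shaping reward for goal navigation (progress toward the goal, a collision penalty, and a goal bonus; all magnitudes are hypothetical, chosen only for illustration) could look like:

```python
import numpy as np

def navigation_reward(prev_dist, curr_dist, collided, goal_reached,
                      progress_scale=1.0, collision_penalty=-10.0, goal_bonus=10.0):
    """Shaping reward: positive reward proportional to the progress made
    toward the goal, a large penalty on collision, and a bonus on success.
    Magnitudes are illustrative, not the values used in the project."""
    if collided:
        return collision_penalty
    if goal_reached:
        return goal_bonus
    # Distance-based progress term: needs only odometry / localization,
    # which is also measurable on a real robot.
    return progress_scale * (prev_dist - curr_dist)

# Example: the robot moved 0.2 m closer to the goal in this step.
r = navigation_reward(prev_dist=3.0, curr_dist=2.8, collided=False, goal_reached=False)
```

Because every term can be computed from on-board odometry and bumper/contact sensing, such a reward transfers from simulation to a physical robot without ground-truth state.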

Currently, we are extending this work to multi-modal sensor inputs combining Lidar and RGB (with the RGB image covering only the frontal view) and are incorporating self-curriculum learning on a more challenging warehouse navigation task, with promising initial outcomes.
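Such a multi-modal observation can be sketched as a dictionary combining both sensor streams (the shapes, normalization, and the `make_observation` helper below are illustrative assumptions, not the project's actual interface):

```python
import numpy as np

def make_observation(lidar_scan, rgb_image, max_range=10.0):
    """Combine a 1D 360-degree Lidar scan and a frontal RGB image into one
    normalized observation dict. Shapes and normalization are illustrative."""
    # Clip ranges to the sensor maximum and scale to [0, 1].
    lidar = np.clip(np.asarray(lidar_scan, dtype=np.float32), 0.0, max_range) / max_range
    # Scale 8-bit pixel values to [0, 1].
    rgb = np.asarray(rgb_image, dtype=np.float32) / 255.0
    return {"lidar": lidar, "rgb": rgb}

# One reading per degree, plus a 64x64 frontal RGB frame.
obs = make_observation(np.full(360, 4.0), np.zeros((64, 64, 3), dtype=np.uint8))
```

A policy network would then process the two modalities with separate encoders (e.g., a 1D stream for the scan, a convolutional stream for the image) before fusing them.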

The video shows learned navigation strategies using a single RGB-D camera mounted at the front of the robot. The results were obtained after 85,000 interactions (each interaction is a single action execution, e.g., a wheel-velocity command).

This video shows the learned strategy after 140,000 interactions with the environment.

Dynamic Control of a CableBot

Building a CableBot and Learning the Dynamics Model and the Controller

Controlling cable-driven master-slave robots is a challenging task. Fast and precise motion planning typically requires stabilizing struts, which are disruptive elements in robot-assisted surgeries. In this work, we study parallel kinematics with an active deceleration mechanism that does not require any hindering struts for stabilization.

Reinforcement learning is used to learn control gains and model parameters that allow for fast and precise robot motions without overshooting. The developed mechanical design, as well as the controller optimization framework through learning, can improve the motion and tracking performance of many widely used cable-driven master-slave robots in surgical robotics.
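The idea of learning control gains that avoid overshooting can be sketched with a toy example: a point mass under PD control, with the gains optimized by simple random search against a cost that heavily penalizes overshoot. Everything here (the dynamics, the cost weights, the search method) is an illustrative stand-in, not the project's actual model or learning algorithm.

```python
import numpy as np

def simulate_step_response(kp, kd, dt=0.01, steps=500):
    """Simulate a unit point mass driven by a PD controller toward x = 1
    (semi-implicit Euler); returns the position trajectory."""
    x, v = 0.0, 0.0
    traj = []
    for _ in range(steps):
        a = kp * (1.0 - x) - kd * v   # PD control law
        v += a * dt
        x += v * dt
        traj.append(x)
    return np.array(traj)

def cost(kp, kd):
    """Tracking error plus a heavy penalty on any overshoot past the target."""
    traj = simulate_step_response(kp, kd)
    overshoot = max(0.0, float(traj.max()) - 1.0)
    return float(np.mean((traj - 1.0) ** 2)) + 100.0 * overshoot

# Simple random search over the gains (a stand-in for the learning method).
rng = np.random.default_rng(0)
best = min(((cost(kp, kd), kp, kd)
            for kp, kd in rng.uniform(1.0, 50.0, size=(200, 2))),
           key=lambda t: t[0])
best_cost, best_kp, best_kd = best
```

The overshoot penalty steers the search toward well-damped gain pairs, which is the qualitative behavior the project targets for the cable robot.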

Project Consortium

  • Montanuniversität Leoben

Related Work

H. Yuan, E. Courteille, D. Deblaise (2015). Static and dynamic stiffness analyses of cable-driven parallel robots with non-negligible cable mass and elasticity. Mechanism and Machine Theory, Elsevier, 2015.

M. A. Khosravi, H. D. Taghirad (2011). Dynamic analysis and control of cable driven robots with elastic cables. Transactions of the Canadian Society for Mechanical Engineering, 35(4): 543-557, 2011.

Publications

2019

Rueckert, Elmar; Jauer, Philipp; Derksen, Alexander; Schweikard, Achim

Dynamic Control Strategies for Cable-Driven Master Slave Robots Inproceedings

In: Keck, Tobias (Ed.): Proceedings on Minimally Invasive Surgery, Luebeck, Germany, 2019, (January 24-25, 2019).


Active transfer learning with neural networks through human-robot interactions (TRAIN)

DFG Project 07/2020-07/2023

In our vision, autonomous robots interact with humans at industrial sites, in health care, or in our homes, managing the household. From a technical perspective, all these application domains require robots to process large amounts of noisy sensor observations during the execution of thousands of different motor and manipulation skills. From the perspective of many users, programming these skills manually, or with recent learning approaches that are mostly operable only by experts, is not feasible; this prevents the use of intelligent autonomous systems in tasks of everyday life.

In this project, we aim at improving robot skill learning with deep networks by considering human feedback and guidance. The human teacher rates different transfer learning strategies in the artificial neural network to improve the learning of novel skills by optimally exploiting existing encoded knowledge. Neural networks are ideally suited for this task, as we can gradually increase the number of transferred parameters and can even transition between the transfer of task-specific knowledge and of abstract features encoded in deeper layers. To study this systematically, we evaluate subjective feedback and physiological data from user experiments and elaborate assessment criteria that enable the development of human-oriented transfer learning methods. In two main experiments, we first investigate how users experience transfer learning and then examine the influence of shared autonomy between humans and robots. This will result in a methodical robot skill learning framework that adapts to the users' needs, e.g., by adjusting the robot's degree of autonomy to the requirements of laypersons. Even though we evaluate the learning framework focusing on pick-and-place tasks with anthropomorphic robot arms, our results will be transferable to a broad range of human-robot interaction scenarios, including collaborative manipulation tasks in production and assembly, but also the design of advanced controls for rehabilitation and household robots.
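Gradually increasing the number of transferred parameters can be sketched with a toy numpy network: the first k layers are copied from a source network (an already learned skill) and frozen, while deeper layers stay task specific. The layer sizes and the `transfer_layers` helper are hypothetical illustrations, not the project's actual method.

```python
import numpy as np

def init_network(layer_sizes, rng):
    """Random weight matrices for a simple fully connected network."""
    return [rng.normal(size=(m, n)) for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

def transfer_layers(source, target, k):
    """Copy (transfer) the first k layers from a source network into the
    target network and mark them as frozen; the remaining layers keep
    their own (trainable) parameters for the novel skill."""
    new_weights = [w.copy() for w in target]
    frozen = [False] * len(target)
    for i in range(k):
        new_weights[i] = source[i].copy()
        frozen[i] = True
    return new_weights, frozen

rng = np.random.default_rng(42)
source = init_network([8, 16, 16, 4], rng)   # network for a known skill
target = init_network([8, 16, 16, 4], rng)   # network for the novel skill
weights, frozen = transfer_layers(source, target, k=2)
```

Sweeping k from 0 (learn from scratch) to the full depth (reuse everything) yields the family of transfer strategies a human teacher could rate.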

Project Consortium

  • Friedrich-Alexander-Universität Erlangen-Nürnberg

  • Montanuniversität Leoben

Links

Details on the research project can be found on the project webpage.


Publications

2021

Tanneberg, Daniel; Ploeger, Kai; Rueckert, Elmar; Peters, Jan

SKID RAW: Skill Discovery from Raw Trajectories Journal Article

In: IEEE Robotics and Automation Letters (RA-L), pp. 1–8, 2021, ISSN: 2377-3766.


Jamsek, Marko; Kunavar, Tjasa; Bobek, Urban; Rueckert, Elmar; Babic, Jan

Predictive exoskeleton control for arm-motion augmentation based on probabilistic movement primitives combined with a flow controller Journal Article

In: IEEE Robotics and Automation Letters (RA-L), pp. 1–8, 2021, ISSN: 2377-3766.


Cansev, Mehmet Ege; Xue, Honghu; Rottmann, Nils; Bliek, Adna; Miller, Luke E.; Rueckert, Elmar; Beckerle, Philipp

Interactive Human-Robot Skill Transfer: A Review of Learning Methods and User Experience Journal Article

In: Advanced Intelligent Systems, pp. 1–28, 2021.


2020

Rottmann, N.; Kunavar, T.; Babič, J.; Peters, J.; Rueckert, E.

Learning Hierarchical Acquisition Functions for Bayesian Optimization Inproceedings

In: International Conference on Intelligent Robots and Systems (IROS’ 2020), 2020.


Xue, H.; Boettger, S.; Rottmann, N.; Pandya, H.; Bruder, R.; Neumann, G.; Schweikard, A.; Rueckert, E.

Sample-Efficient Covariance Matrix Adaptation Evolutional Strategy via Simulated Rollouts in Neural Networks Inproceedings

In: International Conference on Advances in Signal Processing and Artificial Intelligence (ASPAI’ 2020), 2020.


Robert-Bosch-Stiftung LEGO Robotics 07/2019-10/2021

Novel robot technologies and machine learning methods can be key technologies for protecting our environment. Prof. Dr. Elmar Rückert and Mr. Ole Pein launched this page to investigate this topic together with pupils of the Carl-Jacob-Burckhardt-Gymnasium in Lübeck.

The project is realized within the elective courses in the 8th and 9th grades at the Carl-Jacob-Burckhardt-Gymnasium. In the 8th grade, the pupils learn to build and program Lego Mindstorms EV3 robots.

The project is based on our freely available Python software for LEGO EV3s. It is continuously being developed further by a team at the University of Lübeck and adapted to the needs and questions of the pupils.

The project, titled "Autonome Elektrofahrzeuge als urbane Lieferanten" (Autonomous Electric Vehicles as Urban Delivery Agents), is funded by the Robert Bosch Stiftung within the program "Our Common Future".

Link: https://future.ai-lab.science

Safe Autonomous Driving with Probabilistic Neural Networks

We humans are able to perceive complex processes under adverse conditions, e.g., with limited visibility or under disturbances, to predict them, and to make coherent decisions within a few milliseconds. With the increasing degree of automation, the demands on artificial systems grow as well. Ever larger and more complex amounts of data must be processed to make autonomous decisions. With common AI approaches, we run into limits, due to converging miniaturization, that are insufficient, e.g., in the domain of autonomous driving, for developing a safe autonomous system.

The goal of this research is to implement probabilistic prediction models in massively parallelizable neural networks and to use them to make complex decisions based on learned internal prediction models. The neural models process high-dimensional data from modern and innovative tactile and visual sensors. We test the neural prediction and decision models in humanoid robot applications in dynamic environments.

Our approach builds on the theory of probabilistic information processing in neural networks and thus differs fundamentally from common deep neural network methods. The underlying theory provides far-reaching model insights and, in addition to predictions of mean values, also yields uncertainties and correlations. These additional predictions are crucial for reliable, explainable, and robust artificial systems and address one of the biggest open problems in artificial intelligence research.

This project was honored with the German AI Newcomer Award (Deutscher KI-Nachwuchspreis) of Bilanz Deutschland Wirtschaftsmagazin GmbH and demonstrates the importance of basic research in artificial intelligence.
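As a minimal illustration of predicting means together with uncertainties, consider Bayesian linear regression (a deliberately simple stand-in for the probabilistic neural models described above; the prior and noise precisions are illustrative values):

```python
import numpy as np

def bayesian_linear_regression(X, y, alpha=1.0, beta=25.0):
    """Posterior over weights for y = X w + noise.
    alpha: prior precision, beta: noise precision (illustrative values)."""
    A = alpha * np.eye(X.shape[1]) + beta * X.T @ X
    S = np.linalg.inv(A)               # posterior covariance
    m = beta * S @ X.T @ y             # posterior mean
    return m, S

def predict(x, m, S, beta=25.0):
    """Predictive mean and variance at input x."""
    mean = x @ m
    var = 1.0 / beta + x @ S @ x       # noise + model uncertainty
    return mean, var

# Toy 1D data with a bias feature: y roughly equals the second input.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([0.1, 1.0, 2.1])
m, S = bayesian_linear_regression(X, y)
mean_near, var_near = predict(np.array([1.0, 1.0]), m, S)   # inside the data
mean_far, var_far = predict(np.array([1.0, 10.0]), m, S)    # far from the data
```

The predictive variance grows away from the training data, which is exactly the kind of self-assessed uncertainty a safe autonomous system needs.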

AI and Learning in Robotics

Robotics AI requires autonomous learning capabilities

The challenges in understanding human motor control, in brain-machine interfaces, and in anthropomorphic robotics are currently converging. Modern anthropomorphic robots with their compliant actuators and various types of sensors (e.g., depth and vision cameras, tactile fingertips, full-body skin, proprioception) have reached the perceptuomotor complexity faced in human motor control and learning. While outstanding robotic and prosthetic devices exist, current brain-machine interfaces (BMIs) and robot learning methods have not yet reached the autonomy and performance required to enter daily life.

The group's vision is that four major challenges must be addressed to develop truly autonomous learning systems. These are: (1) the decomposition of complex motor skills into basic primitives organized in complex architectures; (2) the ability to learn from partially observable, noisy observations of inhomogeneous high-dimensional sensor data; (3) the learning of abstract features, generalizable models, and transferable policies from human demonstrations, sparse rewards, and through active learning; and (4) accurate predictions of self-motions, object dynamics, and human movements for assisting and cooperating autonomous systems.

Neural and Probabilistic Robotics


Neural models have remarkable learning and modeling capabilities, as demonstrated in complex robot learning tasks (e.g., Martin Riedmiller's or Sergey Levine's work). While these results are promising, we lack a theoretical understanding of the learning capabilities of such networks, and it is unclear how learned features and models can be reused or exploited in other tasks.

The ai-lab investigates deep neural network implementations that are theoretically grounded in the framework of probabilistic inference and develops deep transfer learning strategies for stochastic neural networks. We evaluate our models in challenging robotics applications where the networks have to scale to high-dimensional control signals and need to generate reactive feedback commands in real time.

Our developments will enable complex online adaptation and skill learning behavior in autonomous systems and will help to gain a better understanding of the meaning and function of the learned features in large neural networks with millions of parameters.