
Open Project, M.Sc. or B.Sc. Thesis – Multimodal Human-Autonomous Agents Interaction Using Pre-trained Language and Visual Foundation Models

Supervisor: Linus Nwankwo, M.Sc.;
Univ.-Prof. Dr Elmar Rückert
Start date:  As soon as possible

 

Theoretical difficulty: mid
Practical difficulty: High

Abstract

In this project or thesis, we aim to enhance the method proposed in [1] for robust natural human-autonomous agent interaction through verbal and textual conversations. 

The primary focus is to develop a system that can enhance natural language conversations, understand the semantic context of the robot’s task environment, and abstract this information into actionable commands or queries. This will be achieved by leveraging the capabilities of pre-trained large language models (LLMs) such as GPT-4, visual language models (VLMs) such as CLIP, and audio language models (ALMs) such as AudioLM.

Tentative Work Plan

To achieve the objectives, the following concrete tasks will be focused on:

  • Initialisation and Background:
    • Study the concept of LLMs, VLMs, and ALMs.
    • Understand how LLMs, VLMs, and ALMs can be grounded for autonomous robotic tasks.
    • Familiarise yourself with the methods at the project website – https://linusnep.github.io/MTCC-IRoNL/.
  • Setup and Familiarity with the Simulation Environment
    • Build a robot model (URDF) for the simulation (optional if you wish to use the existing one).
    • Set up the ROS framework for the simulation (Gazebo, Rviz).
    • Recommended programming tools: C++, Python, Matlab.
  • Coding
    • Improve the existing code of the method proposed in [1] to incorporate the aforementioned modalities. The code will be provided to the student.
    • Integrate other LLMs (e.g., LLaMA) and VLMs (e.g., GLIP) into the framework and compare their performance with the baseline (GPT-4 and CLIP).
  • Intermediate Presentation:
    • Present the results of your background study or what you have done so far.
    • Detailed planning of the next steps.
  • Simulation & Real-World Testing (If Possible):
    • Test your implemented model with a Gazebo-simulated quadruped or differential drive robot.
    • Perform the real-world testing of the developed framework with our Unitree Go1 quadruped robot or with our Segway RMP 220 Lite robot.
    • Analyse and compare the model’s performance in real-world scenarios versus simulations with the different LLMs and VLMs pipelines.
  • Optimize the Framework for Optimal Performance and Efficiency (Optional):
    • Validate the model to identify bottlenecks within the robot’s task environment.
  • Documentation and Thesis Writing:
    • Document the entire process, methodologies, and tools used.
    • Analyse and interpret the results.
    • Draft the project report or thesis, ensuring that the primary objectives are achieved.
  • Research Paper Writing (optional)

Related Work

[1]  Linus Nwankwo and Elmar Rueckert. 2024. The Conversation is the Command: Interacting with Real-World Autonomous Robots Through Natural Language. In Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction (HRI ’24). Association for Computing Machinery, New York, NY, USA, 808–812. https://doi.org/10.1145/3610978.3640723.

[2]  Nwankwo, L., & Rueckert, E. (2024). Multimodal Human-Autonomous Agents Interaction Using Pre-Trained Language and Visual Foundation Models. arXiv preprint arXiv:2403.12273.

B.Sc. Thesis – Philipp Zeni – Precision in Motion: ML-Enhanced Race Course Identification for Formula Student Racing

Supervisor: Linus Nwankwo, M.Sc.;
Univ.-Prof. Dr Elmar Rückert
Start date: 30th October 2023

 

Theoretical difficulty: mid
Practical difficulty: High

Abstract

This thesis explores machine learning techniques for analysing onboard recordings from the TU Graz Racing Team, a prominent Formula Student team. The main goal is to design and train an end-to-end machine learning model to autonomously discern race courses based on sensor observations.

Further, this thesis seeks to address the following research questions:

  • Can track markers (cones) be reliably detected and segmented from onboard recordings?
  • Does the delineated racing track provide an adequate level of accuracy to support autonomous driving, minimizing the risk of accidents?
  • How well does a neural network trained on simulated data adapt to real-world situations?
  • Can the neural network ensure real-time processing in high-speed scenarios surpassing 100 km/h?

Tentative Work Plan

To achieve the objectives, the following concrete tasks will be focused on:

  • Thesis initialisation and literature review:
    • Define the scope and boundaries of your work.
    • Study the existing projects in [1] and [2] to identify gaps and methodologies.
  • Set up and familiarize yourself with the simulation environment
    • Build the car model (URDF) for the simulation (optional if you wish to use the existing one)
    • Set up the ROS framework for the simulation (Gazebo, Rviz)
    • Recommended programming tools: C++, Python, Matlab
  • Data acquisition and preprocessing (3D Lidar and RGB-D data)
    • Collect onboard recordings and sensor data from the TU Graz Racing track.
    • Augment the data with additional simulated recordings using ROS, if necessary.
    • Preprocess and label the data for machine learning (ML). This includes segmenting tracks, markers, and other relevant features.
  • Intermediate presentation:
    • Present the results of the literature study or what has been done so far
    • Detailed planning of the next steps
  • ML Model Development:
    •  Design the initial neural network architecture.
    • Train the model using the preprocessed data.
    • Evaluate model performance using metrics like accuracy, precision, recall, etc.
    • Iteratively refine the model based on the evaluation results.
  • Real-world Testing (If Possible):
    • Implement the trained model on a real vehicle’s onboard computer.
    • Test the vehicle in a controlled environment, ensuring safety measures are in place.
    • Analyze and compare the model’s performance in real-world scenarios versus simulations.
  • Optimization for Speed and Efficiency (Optional):
    • Validate the model to identify bottlenecks.
    • Optimize the neural network for real-time performance, especially for high-speed scenarios
  • Documentation and B.Sc. thesis writing:
    • Document the entire process, methodologies, and tools used.
    • Analyze and interpret the results.
    • Draft the thesis, ensuring that at least two of the research questions are addressed.
  • Research paper writing (optional)
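The evaluation step in the work plan above names accuracy, precision, and recall. As a minimal sketch, assuming cone detections have already been matched against ground-truth labels and counted as true positives, false positives, and false negatives, the scores could be computed as follows. The function name `detection_scores` and the example counts are hypothetical and not taken from the thesis.

```python
def detection_scores(tp, fp, fn):
    """Return (precision, recall, f1) from detection counts.

    tp: detections that match a labelled cone
    fp: detections with no matching label
    fn: labelled cones that were not detected
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Illustrative counts only, not measured results.
precision, recall, f1 = detection_scores(tp=8, fp=2, fn=2)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Precision penalises false cone detections, while recall penalises missed cones. Which of the two matters more for keeping the car on the track is a design decision for the thesis.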

Related Work

[1]   Autonomous Racing Graz, “Enhanced localisation for autonomous racing with high-resolution lidar”, Article by Tom Grey, Visited 30.10.2023.

[2]   Autonomous RC car racing ETH Zürich, “The ORCA (Optimal RC Racing) Project”, Article by Alex Liniger, Visited 30.10.2023.

[3]   P. Cai, H. Wang, H. Huang, Y. Liu and M. Liu, “Vision-Based Autonomous Car Racing Using Deep Imitative Reinforcement Learning,” in IEEE Robotics and Automation Letters, vol. 6, no. 4, pp. 7262-7269, Oct. 2021, doi: 10.1109/LRA.2021.3097345.

[4]   Z. Lu, C. Zhang, H. Zhang, Z. Wang, C. Huang and Y. Ji, “Deep Reinforcement Learning Based Autonomous Racing Car Control With Priori Knowledge,” 2021 China Automation Congress (CAC), Beijing, China, 2021, pp. 2241-2246, doi: 10.1109/CAC53003.2021.9728289.

[5]   J. Kabzan, L. Hewing, A. Liniger and M. N. Zeilinger, “Learning-Based Model Predictive Control for Autonomous Racing,” in IEEE Robotics and Automation Letters, vol. 4, no. 4, pp. 3363-3370, Oct. 2019, doi: 10.1109/LRA.2019.2926677.

3D perception and SLAM using geometric and semantic information for mine inspection with quadruped robot

Supervisor: Linus Nwankwo, M.Sc.;
Univ.-Prof. Dr Elmar Rückert
Start date: As soon as possible

 

Theoretical difficulty: mid
Practical difficulty: high

Abstract

Unlike the traditional mine inspection approach, which is inefficient in terms of time, terrain, and coverage, this project/thesis aims to investigate novel 3D perception and SLAM using geometric and semantic information for real-time mine inspection.

We propose to develop a SLAM approach that takes into account the terrain of the mining site and the sensor characteristics to ensure complete coverage of the environment while minimizing traversal time.

Tentative Work Plan

To achieve our objective, the following concrete tasks will be focused on:

  • Study the concept of 3D perception and SLAM for mine inspection, as well as algorithm development, system integration and real-world demonstration using the Unitree Go1 quadrupedal robot.

  • Set up and familiarize yourself with the simulation environment:
    • Build the robot model (URDF) for the simulation (optional if you wish to use the existing one)
    • Set up the ROS framework for the simulation (Gazebo, Rviz)
    • Recommended programming tools: C++, Python, Matlab
  • Develop a novel SLAM system for the quadrupedal robot to navigate, map and interact with challenging real-world environments:
    • 2D/3D mapping in complex indoor/outdoor environments

    • Localization using either Monte Carlo or extended Kalman filter

    • Complete coverage path-planning

  • Intermediate presentation:
    • Presenting the results of the literature study
    • Possibility to ask questions about the theoretical background
    • Detailed planning of the next steps
  • Implementation:

    • Simulate the achieved results in a virtual environment (Gazebo, Rviz, etc.)

    • Real-time testing on the Unitree Go1 quadrupedal robot.

  • Evaluate the performance in various challenging real-world environments, including outdoor terrains, urban environments, and indoor environments with complex structures.
  • M.Sc. thesis or research paper writing (optional)

Related Work

[1]  Wolfram Burgard, Cyrill Stachniss, Kai Arras, and Maren Bennewitz, ‘SLAM: Simultaneous Localization and Mapping’, http://ais.informatik.uni-freiburg.de/teaching/ss12/robotics/slides/12-slam.pdf

[2]  V. Barrile, G. Candela, A. Fotia, ‘Point cloud segmentation using image processing techniques for structural analysis’, The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W11, 2019.

[3]  Łukasz Sobczak, Katarzyna Filus, Adam Domanski and Joanna Domanska, ‘LiDAR Point Cloud Generation for SLAM Algorithm Evaluation’, Sensors 2021, 21, 3313. https://doi.org/10.3390/s21103313.

M.Sc. Project – Mekonoude Etienne Kpanou: Mobile robot teleoperation based on human finger direction and vision

Theoretical difficulty: mid
Practical difficulty: mid

Humans can give directions (forward, back, right, left, etc.) simply by pointing a finger in the intended direction, effortlessly and without saying a word. However, training a mobile robot to understand such gestures remains an open problem.
In the context of this thesis, we propose finger-pose-based mobile robot navigation to maximize natural human-robot interaction. This could be achieved by observing the Cartesian pose of the human fingers with an RGB-D camera and translating it into linear and angular velocity commands for the robot. For this, we will leverage computer vision algorithms and the ROS framework.
The prerequisites for this project are a basic understanding of Python or C++ programming, OpenCV, and ROS.
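
The translation from finger pose to velocity commands could be sketched as below. This is only an illustration under stated assumptions: the knuckle and fingertip are assumed to be already available as 3D points in the robot base frame (x forward, y left), and the function name `pointing_to_velocity` and the speed limits are hypothetical placeholders rather than part of the project code.

```python
import math


def pointing_to_velocity(knuckle, fingertip, max_linear=0.5, max_angular=1.0):
    """Map a pointing direction to (linear, angular) velocity commands.

    knuckle, fingertip: (x, y, z) points in the robot base frame,
    with x forward and y to the left (an assumed convention).
    """
    dx = fingertip[0] - knuckle[0]
    dy = fingertip[1] - knuckle[1]
    # If the finger has no component in the ground plane, command a stop.
    if math.hypot(dx, dy) < 1e-6:
        return 0.0, 0.0
    heading = math.atan2(dy, dx)  # pointing angle in the ground plane
    # Forward speed scales with how much the finger points ahead (negative
    # when pointing backwards); turn rate scales with the sideways component.
    linear = max_linear * math.cos(heading)
    angular = max_angular * math.sin(heading)
    return linear, angular


# Pointing straight ahead drives forward; pointing left turns left.
print(pointing_to_velocity((0.0, 0.0, 0.0), (1.0, 0.0, 0.0)))
print(pointing_to_velocity((0.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```

In a ROS node, the two returned values would typically fill the `linear.x` and `angular.z` fields of a `geometry_msgs/Twist` message.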

Tentative work plan

In the course of this thesis, the following concrete tasks will be focused on:

  • study the concept of visual navigation of mobile robots
  • develop a hand detection and tracking algorithm in Python or C++
  • apply the developed algorithm to navigate a simulated mobile robot
  • real-time experimentation
  • thesis writing

References

  1. Shuang Li, Jiaxi Jiang, Philipp Ruppel, Hongzhuo Liang, Xiaojian Ma, Norman Hendrich, Fuchun Sun, Jianwei Zhang, “A Mobile Robot Hand-Arm Teleoperation System by Vision and IMU”, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), October 25-29, 2020, Las Vegas, NV, USA.

M.Sc. Thesis – Stefan Maintinger: Map-based and map-less mobile navigation in crowded dynamic environments

Supervisor: Linus Nwankwo, M.Sc.;
Vedant Dave M.Sc.;
Univ.-Prof. Dr Elmar Rückert
Start date: 1st June 2023

 

Theoretical difficulty: mid
Practical difficulty: mid

Abstract

For more than two decades now, the technique of simultaneous localization and mapping (SLAM) has served as a cornerstone in achieving goals related to autonomous navigation.

The core essence of the SLAM problem lies in the creation of an environmental map while concurrently estimating the robot’s relative position to this map. This task is undertaken with the aid of sensor observations and control data, both of which are subject to noise.

Recently, a shift towards a mapless approach based on deep reinforcement learning has emerged. In this approach, the agent (here, a robot) learns the navigation policy from sensor and control data alone, bypassing the need for a prior map of the task environment. In this thesis, we will conduct a comprehensive performance evaluation of both the traditional SLAM approach and the emerging mapless approach. A dynamic, crowded environment will serve as our test bed, and the open-source open-shuttle mobile robot with a differential drive will serve as our experimental platform.

Tentative Work Plan

To achieve our objective, the following concrete tasks will be focused on:

  • Literature research and field familiarization
    • Mobile robotics and industrial use cases
    • Overview of map-based autonomous navigation (SLAM & Path planning)
    • Overview of mapless-based autonomous navigation approach with deep reinforcement learning
  • Set up and familiarize yourself with the simulation environment
    • Build the robot model (URDF) for the simulation (optional if you wish to use the existing one)
    • Set up the ROS framework for the simulation (Gazebo, Rviz)
    • Recommended programming tools: C++, Python, Matlab
  • Intermediate presentation:
    • Presenting the results of the literature study
    • Possibility to ask questions about the theoretical background
    • Detailed planning of the next steps
  • Define key performance/quality metrics for evaluation:
    • Time to reach the desired goal
    • Average/mean speed
    • Path smoothness
    • Obstacle avoidance/distance to obstacles
    • Computational requirement
    • Success rate
    • Other relevant parameters
  • Assessment and execution:
    • Compare the results from both map-based and map-less approaches on the above-defined evaluation metrics.
  • Validation:
    • Validate both approaches in a real-world scenario using our open-source open-shuttle mobile robot.
  • Furthermore, the following optional goals are planned:
    • Develop a hybrid approach combining both the map-based and the map-less methods.
  • M.Sc. thesis writing
  • Research paper writing (optional)
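
Several of the evaluation metrics listed in the work plan above can be computed from a logged trajectory. The following is a minimal sketch, assuming the robot's 2D positions and timestamps have been recorded for one run. The function name `path_metrics` is hypothetical, and using the total absolute heading change as a smoothness proxy (lower is smoother) is an assumption, not a definition from the thesis.

```python
import math


def path_metrics(poses, stamps):
    """Compute simple navigation metrics from a recorded trajectory.

    poses: list of (x, y) positions in metres
    stamps: list of matching timestamps in seconds
    """
    segments = list(zip(poses, poses[1:]))
    length = sum(math.dist(a, b) for a, b in segments)
    duration = stamps[-1] - stamps[0]
    headings = [math.atan2(b[1] - a[1], b[0] - a[0]) for a, b in segments]
    total_turning = 0.0
    for h0, h1 in zip(headings, headings[1:]):
        # Wrap the heading change into [-pi, pi) before accumulating.
        delta = (h1 - h0 + math.pi) % (2 * math.pi) - math.pi
        total_turning += abs(delta)
    return {
        "time_to_goal": duration,
        "path_length": length,
        "mean_speed": length / duration if duration > 0 else 0.0,
        "total_turning": total_turning,
    }


# Illustrative trajectory: one metre forward, then one metre to the left.
print(path_metrics([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)], [0.0, 1.0, 2.0]))
```

Running the same function on logs from the map-based and the map-less runs would give directly comparable numbers for each metric.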

Related Work

[1] Xue, Honghu; Hein, Benedikt; Bakr, Mohamed; Schildbach, Georg; Abel, Bengt; Rueckert, Elmar, “Using Deep Reinforcement Learning with Automatic Curriculum Learning for Mapless Navigation in Intralogistics”, In: Applied Sciences (MDPI), Special Issue on Intelligent Robotics, 2022.

[2] Han Hu, Kaicheng Zhang, Aaron Hao Tan, Michael Ruan, Christopher Agia, Goldie Nejat, “Sim-to-Real Pipeline for Deep Reinforcement Learning for Autonomous Robot Navigation in Cluttered Rough Terrain”, IEEE Robotics and Automation Letters (Volume: 6, Issue: 4, October 2021).

[3] Md. A. K. Niloy, Anika Shama, Ripon K. Chakrabortty, Michael J. Ryan, Faisal R. Badal, Z. Tasneem, Md H. Ahamed, S. I. Mo, “Critical Design and Control Issues of Indoor Autonomous Mobile Robots: A Review”, IEEE Access (Volume: 9), February 2021.

[4]  Ning Wang, Yabiao Wang, Yuming Zhao, Yong Wang and Zhigang Li, “Sim-to-Real: Mapless Navigation for USVs Using Deep Reinforcement Learning”, Journal of Marine Science and Engineering, 2022, 10, 895. https://doi.org/10.3390/jmse10070895

13.03.2023 – Innovative Research Discussion

Meeting notes on the 13th of March, 2023

Location: Chair of CPS

Date & Time: 13th March, 2023, 11:45 am to 12:45 pm

Participants: Univ.-Prof. Dr. Elmar Rueckert, Linus Nwankwo, M.Sc.

 

Agenda

  1. General Discussion
  2. Discussion on research progress
  3. Next action

General Discussion

  1.  The applied machine and deep learning course starts on 02.10.2023.
  2.  Study the publication [1] for the next work.

Do next

  1. 3D complete coverage on quadruped robot for mine inspection tasks.
  2. Prior in SLAM from architectural floor plans.

Reference

[1]    Song, Soohwan & Kim, Daekyum & Jo, Sungho. (2020). Online coverage and inspection planning for 3D modelling. Autonomous Robots. 44. 10.1007/s10514-020-09936-7.

Gabriel Brinkmann

Bachelor Thesis Student at the Montanuniversität Leoben


Short bio: Gabriel is a Bachelor’s student in Mechanical Engineering at Montanuniversität Leoben and, as of March 2023, is writing his Bachelor’s thesis at the Chair of Cyber-Physical Systems.

Research Interests

  • Robotics

Thesis

Contact

Gabriel Brinkmann
Bachelor Thesis Student at the Chair of Cyber-Physical-Systems
Montanuniversität Leoben
Franz-Josef-Straße 18, 
8700 Leoben, Austria 

Email:   

B.Sc. Thesis – Gabriel Brinkmann: Simultaneous localization and mapping (SLAM) with a quadrupedal robot in challenging real-world environments

Supervisor: Linus Nwankwo, M.Sc.;
Univ.-Prof. Dr Elmar Rückert
Start date: 5th September 2022

 

Theoretical difficulty: mid
Practical difficulty: mid

Abstract

When observing animals in nature, navigation and walking seem like effortless tasks. However, training robots to achieve the same objective effectively is still a challenging problem for roboticists and researchers. We aim to autonomously perform tasks such as navigating traffic, avoiding obstacles, finding optimal routes, and surveying areas hazardous to humans with a quadrupedal robot. These tasks are useful in commercial, industrial, and military settings, including self-driving cars, warehouse stacking robots, container transport vehicles in ports, and load-bearing companions for military operations.

For over 20 years, the SLAM approach has been widely used to achieve autonomous navigation, obstacle avoidance, and path planning objectives. SLAM is a crucial problem in robotics, where a robot navigates through an unknown environment while simultaneously creating a map of it. The SLAM problem is challenging as it requires the robot to estimate its pose (position and orientation) relative to the environment and simultaneously estimate the location of landmarks in the environment.

Some of the most common challenges with SLAM are the accumulation of localization errors over time, inaccurate pose estimation on a map, and loop closure. These problems have been partly overcome by using pose graphs to correct accumulated localization errors, and extended Kalman filters and Monte Carlo localization for pose estimation.
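
To illustrate the filtering idea behind pose estimation, the sketch below shows one predict-and-update step of a one-dimensional Kalman filter, the linear special case of the extended Kalman filter mentioned above. It is a simplified teaching example under assumed noise values, not the estimator used on the robot, and the function name `kalman_step` is hypothetical.

```python
def kalman_step(mean, var, control, measurement, motion_var, meas_var):
    """One predict/update cycle of a 1D Kalman filter.

    mean, var: current position estimate and its variance
    control: commanded displacement since the last step
    measurement: direct, noisy observation of the position
    motion_var, meas_var: assumed motion and measurement noise variances
    """
    # Predict: apply the motion and grow the uncertainty.
    mean_pred = mean + control
    var_pred = var + motion_var
    # Update: blend prediction and measurement using the Kalman gain.
    gain = var_pred / (var_pred + meas_var)
    mean_new = mean_pred + gain * (measurement - mean_pred)
    var_new = (1.0 - gain) * var_pred
    return mean_new, var_new


# Illustrative numbers only.
print(kalman_step(mean=0.0, var=1.0, control=1.0, measurement=2.0,
                  motion_var=1.0, meas_var=2.0))
```

The prediction step is where odometry errors accumulate, and the update step is where sensor observations pull the estimate back. An extended Kalman filter follows the same pattern but linearises non-linear motion and measurement models at each step.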

Quadrupedal robots are well suited for challenging environments where the surface conditions are non-uniform, e.g. off-road environments or warehouses where stairs or obstacles have to be overcome. However, their non-uniform, dynamic movement poses an additional difficulty for SLAM.

In the context of this thesis, we propose to study the concept of SLAM with its associated algorithms and apply it to a quadrupedal robot (Unitree Go1). Our goal is to provide the robot with certain tasks and commands that it will then have to autonomously execute. For example, navigate rooms, avoid slow-moving objects, follow an object (person), etc.

 

Tentative Work Plan

To achieve our objective, the following concrete tasks will be focused on:

  • Study the concept of SLAM as well as its application in quadrupedal robots.

  • Set up and familiarize yourself with the simulation environment:
    • Build the robot model (URDF) for the simulation (optional if you wish to use the existing one)
    • Set up the ROS framework for the simulation (Gazebo, Rviz)
    • Recommended programming tools: C++, Python, Matlab
  • Develop a novel SLAM system for a quadrupedal robot to navigate and map challenging real-world environments:
    • 2D/3D mapping in complex indoor/outdoor environments

    • Localization using either Monte Carlo or extended Kalman filter

    • Establish a path-planning algorithm

  • Intermediate presentation:
    • Presenting the results of the literature study
    • Possibility to ask questions about the theoretical background
    • Detailed planning of the next steps
  • Implementation:

    • Simulate the achieved results in a virtual environment (Gazebo, Rviz, etc.)

    • Real-time testing on the Unitree Go1 quadrupedal robot.

  • Evaluate the performance in various challenging real-world environments, including outdoor terrains, urban environments, and indoor environments with complex structures.
  • B.Sc. thesis writing.
  • Research paper writing (optional)

Related Work

[1]  Wolfram Burgard, Cyrill Stachniss, Kai Arras, and Maren Bennewitz, ‘SLAM: Simultaneous Localization and Mapping’, http://ais.informatik.uni-freiburg.de/teaching/ss12/robotics/slides/12-slam.pdf

[2]  V. Barrile, G. Candela, A. Fotia, ‘Point cloud segmentation using image processing techniques for structural analysis’, The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLII-2/W11, 2019.

[3]  Łukasz Sobczak, Katarzyna Filus, Adam Domanski and Joanna Domanska, ‘LiDAR Point Cloud Generation for SLAM Algorithm Evaluation’, Sensors 2021, 21, 3313. https://doi.org/10.3390/s21103313.

10.11.2022 – Innovative Research Discussion

Meeting notes on the 10th of November, 2022

Location: Chair of CPS

Date & Time: 10th November, 2022, 1:30 pm to 2:28 pm

Participants: Univ.-Prof. Dr. Elmar Rueckert, Linus Nwankwo, M.Sc.

 

Agenda

  1. General Discussion
  2. Discussion on the data set from Privatklinik Graz
  3. Next action

General Discussion

  1.  New cards have been added to the Deck app; check the ones that require action.
  2.  There will be a ROS2 meeting with Nils at 5 pm on 11.11.2022.

Dataset from Privatklinik Graz

  1. Reproduce the failed experiments.

  2. Evaluate the S-PTAM and ORB-SLAM visual SLAM algorithms on the recorded data.

  3. Evaluate Hector SLAM and GMapping algorithms on the recorded dataset(s) with a limited field of view of the lidar data.

    • Remove 90° in the frontal direction
    • Remove 120° in the frontal direction
    • Remove 90° or 120° in both the frontal and the backward direction
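
The limited field of view could be emulated by masking beams in each recorded scan before it is passed to the SLAM algorithm. Below is a minimal sketch, assuming scans follow the `sensor_msgs/LaserScan` layout (`ranges`, `angle_min`, `angle_increment`) with 0 rad pointing to the front of the robot. The function name `mask_sector` and the choice to mark removed beams as infinity are assumptions.

```python
import math


def mask_sector(ranges, angle_min, angle_increment, center_deg, width_deg):
    """Return a copy of `ranges` with one angular sector set to infinity.

    center_deg: direction of the sector centre (0 = front, 180 = back)
    width_deg: total opening angle of the removed sector
    """
    half_width = math.radians(width_deg) / 2.0
    center = math.radians(center_deg)
    masked = list(ranges)
    for i in range(len(masked)):
        angle = angle_min + i * angle_increment
        # Smallest signed difference between the beam and the sector centre.
        diff = (angle - center + math.pi) % (2 * math.pi) - math.pi
        if abs(diff) <= half_width:
            masked[i] = math.inf
    return masked


# Illustrative 360-beam scan with all ranges at 1 m, covering -180° to 180°.
scan = [1.0] * 360
step = 2 * math.pi / 360
front_removed = mask_sector(scan, -math.pi, step, center_deg=0, width_deg=90)
both_removed = mask_sector(front_removed, -math.pi, step, center_deg=180, width_deg=90)
print(front_removed[180], front_removed[90], both_removed[0])
```

Calling the function twice, once centred at 0° and once at 180°, covers the third case in the list above.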

Do next

  1. Implement a filter node that filters out the noise from the data.
  2. Build the map from the hospital’s building plan (bp)

20.10.2022 – Innovative Research Discussion

Meeting notes on the 20th of October, 2022

Location: Chair of CPS

Date & Time: 20th October, 2022, 12:35 pm to 1:38 pm

Participants: Univ.-Prof. Dr. Elmar Rueckert, Linus Nwankwo, M.Sc.

 

Agenda

  1. General Discussion
  2. Update on Conference Paper
  3. Do-It-Lab

General Discussion

  1.  Add dates to all the meeting notes
  2. Add publication to the home page
  3. Contact Christopher for his presentation date and update our calendar accordingly
  4. Use the Deck app to communicate updates of current, completed and yet-to-be-done tasks

Update on Conference Paper

  1.  Compare map-based and map-less indoor SLAM methods.
  2. Focus on indoor navigation.
  3. Evaluate 1 – 3 lidar-based, 1 – 3 visual-based, and 1 – 3 deep learning SLAM methods.
  4. Pick up some ideas from the referenced papers in the Deck app.

Do-It-Lab

  1. Organise the students into four groups
  2.  Give the students the questionnaire after the lab to fill out and submit. You could generate a barcode using the web link to the form.