Supervisors: Honghu Xue, Elmar Rückert
Finished: 22 April 2022
In this thesis, we propose an approach that combines reinforcement learning and automatic curriculum learning to solve a visual navigation task. A pedestrian agent learns a policy from scratch in a street-crossing scenario in the realistic traffic simulator CARLA. The pedestrian is restricted to its first-person perspective as sensory input and therefore cannot obtain full knowledge of the environment, which raises a partial-observability challenge. To address this, we implement an improved version of the Distributional Soft Actor-Critic algorithm that adopts a newly proposed 3D dilated convolutional architecture to cope with partial observability. To further improve performance, we develop an automatic curriculum learning algorithm called NavACL+ on top of NavACL. As the results and ablation studies suggest, our approach outperforms the original NavACL by 23.1%, and NavACL+ converges 37.5% faster. Moreover, the validation results show that policies trained with NavACL+ generalize substantially better and are more robust to different initial starting poses than the other variants: NavACL+ policies perform 28.3% better than policies trained from a fixed start.