Learning-based End-to-End Autonomous Navigation for Planetary Rovers Considering Non-Geometric Locomotion Hazards

1State Key Laboratory of Robotics and Systems, Harbin Institute of Technology, 2Department of Aerospace Engineering, Toronto Metropolitan University

Abstract

Autonomous navigation plays an increasingly crucial role in rover-based planetary missions. End-to-end navigation approaches built upon deep reinforcement learning have become a rising research topic thanks to their adaptability in complex environments. However, most existing works focus on geometric obstacle avoidance and thus have limited capability to cope with the non-geometric hazards that are ubiquitous in the wild, such as slipping and sinking caused by poor mechanical properties of the terrain. Autonomous navigation in unstructured, harsh environments therefore remains a great challenge requiring further in-depth study.

In this paper, a DRL-based navigation method is proposed to autonomously guide a planetary rover toward goals along hazard-free paths with low wheel slip ratios. We introduce an end-to-end network architecture in which visual perception and wheel-terrain interaction are fused to implicitly learn a representation of terrain mechanical properties, which in turn facilitates policy learning for non-geometric hazard avoidance. In simulation, our approach outperforms baseline methods, showing superior avoidance of both geometric obstacles and non-geometric hazards. Experiments conducted at a Mars emulation site demonstrate that our approach can be deployed on a planetary rover and can handle locomotion risks in real-world navigation tasks.
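For reference, the wheel slip ratio mentioned above is commonly defined from the wheel's circumferential speed and the actual travel speed. The snippet below sketches the standard definition for a driving wheel; the exact formulation and thresholds used in the paper may differ.

```python
def slip_ratio(wheel_radius: float, wheel_angular_vel: float, body_vel: float) -> float:
    """Standard longitudinal slip ratio for a driving wheel.

    s = (r * omega - v) / (r * omega), with r * omega >= v >= 0 while driving.
    s near 0 means free rolling; s near 1 means the wheel spins in place
    (severe slip, typically accompanied by sinkage in soft terrain).
    """
    circumferential_vel = wheel_radius * wheel_angular_vel
    if circumferential_vel <= 1e-6:  # wheel not actuated; slip is undefined, report 0
        return 0.0
    return max(0.0, (circumferential_vel - body_vel) / circumferential_vel)
```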

Video

Method

We propose to incorporate both proprioceptive and visual information for locomotion tasks using a novel Transformer model, LocoTransformer. Our model consists of two components: (i) separate modality encoders for proprioceptive and visual inputs that project both modalities into a latent feature space; (ii) a shared Transformer encoder that performs cross-modality attention over proprioceptive and visual features, as well as spatial attention over visual tokens, to predict actions and values.
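To make the two-component design concrete, the sketch below shows one possible realization of separate modality encoders feeding a shared Transformer encoder with policy and value heads. This is a hypothetical illustration, not the released implementation; the layer sizes, number of visual tokens, attention heads, and the proprioceptive input dimension are all assumptions.

```python
import torch
import torch.nn as nn

class CrossModalPolicy(nn.Module):
    """Illustrative two-stream encoder + shared Transformer fusion (sketch only)."""

    def __init__(self, proprio_dim=32, embed_dim=128, n_actions=2):
        super().__init__()
        # (i) separate modality encoders
        self.visual_encoder = nn.Sequential(            # depth/RGB image -> feature map
            nn.Conv2d(1, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, embed_dim, 4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),                     # 4x4 grid = 16 visual tokens
        )
        self.proprio_encoder = nn.Sequential(            # wheel-terrain interaction signals
            nn.Linear(proprio_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )
        # (ii) shared Transformer encoder over the concatenated token sequence
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=2)
        self.policy_head = nn.Linear(embed_dim, n_actions)  # action logits / means
        self.value_head = nn.Linear(embed_dim, 1)            # state-value estimate

    def forward(self, image, proprio):
        # image: (B, 1, H, W), proprio: (B, proprio_dim)
        vis = self.visual_encoder(image)                      # (B, C, 4, 4)
        vis_tokens = vis.flatten(2).transpose(1, 2)           # (B, 16, C) spatial tokens
        prop_token = self.proprio_encoder(proprio).unsqueeze(1)  # (B, 1, C)
        tokens = torch.cat([prop_token, vis_tokens], dim=1)   # cross-modal token sequence
        fused = self.fusion(tokens).mean(dim=1)               # pooled fused representation
        return self.policy_head(fused), self.value_head(fused)
```

In this sketch, the proprioceptive token attends to every visual token (and vice versa) inside the shared Transformer, which is one straightforward way to obtain the cross-modality and spatial attention described above.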

Results

We evaluate our method in both simulation and the real world. In simulation, we model a six-wheeled rover prototype in the MarsSim simulator, which provides high-fidelity physical and visual simulation for rovers and generates diverse terrains with different mechanical properties. We then deploy the trained navigation model on a three-wheeled rover prototype and conduct experiments at a Mars emulation site to further demonstrate the practical performance of our method.

Evaluations in simulation

Scene 1

Scene 2

Experiments in real-world scenarios

Scene 1

Scene 2

Scene 3

Scene 4