Ægidius, Sebastian;
Hadjivelichkov, Dennis;
Jiao, Jianhao;
Embley-Riches, Jonathan;
Kanoulas, Dimitrios;
(2025)
Watch Your STEPP: Semantic Traversability Estimation Using Pose Projected Features.
In:
2025 IEEE International Conference on Robotics and Automation (ICRA).
(pp. 2376-2382).
IEEE: Atlanta, GA, USA.
Text: 2501.17594v1.pdf - Accepted Version (34MB)
Abstract
Understanding the traversability of terrain is essential for autonomous robot navigation, particularly in unstructured environments such as natural landscapes. Although traditional methods, such as occupancy mapping, provide a basic framework, they often fail to account for the complex mobility capabilities of some platforms such as legged robots. In this work, we propose a method for estimating terrain traversability by learning from demonstrations of human walking. Our approach leverages dense, pixel-wise feature embeddings generated using the DINOv2 vision Transformer model, which are processed through an encoder-decoder MLP architecture to analyze terrain segments. The averaged feature vectors, extracted from the masked regions of interest, are used to train the model in a reconstruction-based framework. By minimizing reconstruction loss, the network distinguishes between familiar terrain with a low reconstruction error and unfamiliar or hazardous terrain with a higher reconstruction error. This approach facilitates the detection of anomalies, allowing a legged robot to navigate more effectively through challenging terrain. We validate the proposed method in real-world indoor and outdoor experiments on the ANYmal legged robot. The code is open source, and video demonstrations are available on our website: https://rpl-cs-ucl.github.io/STEPP/
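The reconstruction-based idea in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: random vectors stand in for DINOv2 embeddings, the network is a single-hidden-layer autoencoder trained with hand-written gradient descent, and every name (`masked_mean_feature`, `TinyAutoencoder`, the prototype vectors, the feature dimension of 16) is a hypothetical choice made for illustration only.

```python
import numpy as np


def masked_mean_feature(feature_map, mask):
    """Average per-pixel feature vectors inside a boolean mask.

    feature_map: (H, W, D) dense features; mask: (H, W) boolean array.
    """
    return feature_map[mask].mean(axis=0)


class TinyAutoencoder:
    """Encoder-decoder MLP with one tanh hidden layer (illustrative only)."""

    def __init__(self, dim, hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, dim))
        self.b2 = np.zeros(dim)

    def forward(self, x):
        h = np.tanh(x @ self.W1 + self.b1)  # encoder
        return h, h @ self.W2 + self.b2     # decoder

    def reconstruction_error(self, x):
        """Per-sample mean squared reconstruction error."""
        _, x_hat = self.forward(x)
        return ((x_hat - x) ** 2).mean(axis=1)

    def fit(self, x, epochs=2000, lr=0.01):
        """Full-batch gradient descent on the squared error
        (summed over feature dimensions, averaged over samples)."""
        n = x.shape[0]
        for _ in range(epochs):
            h, x_hat = self.forward(x)
            g_out = 2.0 * (x_hat - x) / n               # d(loss)/d(x_hat)
            g_h = (g_out @ self.W2.T) * (1.0 - h ** 2)  # backprop through tanh
            self.W2 -= lr * (h.T @ g_out)
            self.b2 -= lr * g_out.sum(axis=0)
            self.W1 -= lr * (x.T @ g_h)
            self.b1 -= lr * g_h.sum(axis=0)


# Synthetic stand-ins for embeddings (assumption: real features would come
# from a vision backbone such as DINOv2, which is not used here).
rng = np.random.default_rng(42)
D = 16
familiar_proto = rng.normal(size=D)    # terrain seen in demonstrations
unfamiliar_proto = rng.normal(size=D)  # terrain never walked on

# One toy "image": the masked region looks like familiar terrain.
feature_map = unfamiliar_proto + 0.05 * rng.normal(size=(8, 8, D))
mask = np.zeros((8, 8), dtype=bool)
mask[4:, 2:6] = True
feature_map[mask] = familiar_proto + 0.05 * rng.normal(size=(int(mask.sum()), D))
segment_feature = masked_mean_feature(feature_map, mask)

# Train only on familiar features, then score held-out samples.
train = familiar_proto + 0.05 * rng.normal(size=(200, D))
test_familiar = familiar_proto + 0.05 * rng.normal(size=(50, D))
test_unfamiliar = unfamiliar_proto + 0.05 * rng.normal(size=(50, D))

model = TinyAutoencoder(dim=D, hidden=4)
model.fit(train)
err_familiar = model.reconstruction_error(test_familiar)
err_unfamiliar = model.reconstruction_error(test_unfamiliar)
segment_error = model.reconstruction_error(segment_feature[None, :])[0]

print("mean error, familiar:  ", err_familiar.mean())
print("mean error, unfamiliar:", err_unfamiliar.mean())
print("error of masked segment:", segment_error)
```

Because the model only ever sees familiar features during training, it reconstructs them well and reconstructs anything else poorly, so the reconstruction error can serve as an anomaly score. How that score is thresholded or converted into a traversability cost is a separate design choice that this sketch does not address.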
Type: | Proceedings paper |
---|---|
Title: | Watch Your STEPP: Semantic Traversability Estimation Using Pose Projected Features |
Event: | IEEE International Conference on Robotics and Automation (ICRA) |
Dates: | 19 May 2025 - 23 May 2025 |
ISBN-13: | 979-8-3315-4139-2 |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1109/ICRA55743.2025.11127781 |
Publisher version: | https://doi.org/10.1109/icra55743.2025.11127781 |
Language: | English |
Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
Keywords: | Legged locomotion; Computer vision; Codes; Navigation; Semantics; Estimation; Feature extraction; Transformers; Vectors; Videos |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > UCL BEAMS; UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
URI: | https://discovery.ucl.ac.uk/id/eprint/10215504 |