Facilitating Safe Sim-to-Real through Simulator Abstraction and Zero-shot Task Composition

Abstract

Simulators are a fundamental part of training robots to solve complex control and navigation tasks. They offer speed and safety compared with training directly on a physical system, where exploration may drive the system towards actions that are dangerous to itself and its environment. However, simulators have a fundamental drawback known as the reality gap: the discrepancy in performance that occurs when a robot trained in simulation performs the same task in the real world. The reality gap is prohibitive because many of the most powerful recent advances in reinforcement learning (RL) have a sample complexity too high for training on a physical system to be feasible, so without reliable transfer from simulation they cannot be used with robots. In this work we introduce a framework for applying high-sample-complexity RL algorithms to robots by leveraging recent advances in hierarchical RL and skill composition. We demonstrate that adapting hierarchical RL techniques allows us to close the reality gap at multiple levels of abstraction. As a result, we are able to train a robot to perform combinatorially many tasks within a domain with minimal training on a physical system or steps of error correction. We believe this work provides an important starting framework for applying hierarchical RL to perform sim-to-real generalisation at multiple levels of abstraction.

Publication
In Lifelong Learning of High-level Cognitive and Reasoning Skills Workshop @ IROS 2022
Tamlin Love
PhD Student

I am a PhD student at the Institut de Robòtica i Informàtica Industrial (IRI) (under CSIC and UPC) in Barcelona, working on the TRAIL Marie Skłodowska-Curie Doctoral Network under the supervision of Guillem Alenyà. I was previously an MSc student and lecturer at the University of the Witwatersrand, under the supervision of Benjamin Rosman and Ritesh Ajoodha, as well as a member of the RAIL Lab.

Devon Jarvis
Associate Lecturer

I am a PhD candidate and Associate Lecturer at Wits interested in studying systematic generalization and the emergence of modularity in the brain and machines.

Geraud Nangue Tasse
Associate Lecturer

I am an IBM PhD fellow interested in reinforcement learning (RL) since it is the subfield of machine learning with the most potential for achieving AGI.

Branden Ingram
Lecturer

I am primarily interested in AI for games.

Steven James
Deputy Lab Director

My research interests include reinforcement learning and planning.

Benjamin Rosman
Lab Director

I am a Professor in the School of Computer Science and Applied Mathematics at the University of the Witwatersrand in Johannesburg. I work in robotics, artificial intelligence, decision theory and machine learning.