Composition and Zero-Shot Transfer with Lattice Structures in Reinforcement Learning

Abstract

An important property of long-lived agents is the ability to reuse existing knowledge to solve new tasks. An appealing approach towards obtaining such agents is by leveraging logical composition over tasks, where new tasks are defined by applying logic operators to previously-solved ones. This composition is particularly powerful since it provides a human-understandable mechanism for task specification. However, no unifying formalism for applying logic operators to tasks and generalising combinatorially over them has yet been developed. We address the problem by formally defining logical composition as operators acting on a set of tasks in a lattice structure—the algebraic structure that generalises the study of Boolean logic. This provides a theoretically rigorous method for composing tasks, allowing us to formulate new tasks in terms of the negation, disjunction, and conjunction of a set of base tasks. We prove that by learning a new type of goal-oriented value function model free, called the world value function, an agent can solve composite tasks involving arbitrary logical operators with no further learning. We verify our approach in high-dimensional domains—including a video game environment and continuous-control task—where an agent first learns to solve a set of base tasks, and then composes these solutions to solve a super-exponential number of new tasks.

Publication
Journal of Artificial Intelligence Research
Geraud Nangue Tasse
Geraud Nangue Tasse
Lecturer

I am interested in reinforcement learning (RL) since it is the subfield of machine learning with the most potential for achieving AGI.

Steven James
Steven James
Deputy Lab Director

My research interests include reinforcement learning and planning.

Benjamin Rosman
Benjamin Rosman
Lab Director

I am a Professor in the School of Computer Science and Applied Mathematics at the University of the Witwatersrand in Johannesburg. I work in robotics, artificial intelligence, decision theory and machine learning.