Context-based Online Policy Instantiation for Multiple Tasks and Changing Environments

Abstract

This paper addresses the problem of online decision making in continually changing and complex environments, with inherent incompleteness in models of change. A fully general version of this problem is intractable but many interesting domains are rendered manageable by the fact that all instances of a task can be generated from a finite set of qualitatively meaningful contexts. We present an approach to online decision making that exploits this decomposability in a two part procedure. In a task independent exploratory process, our algorithm running on an autonomous agent learns the set of structural landmark contexts which compose its domain, and reduces this set through the use of the symmetry structure of permutation groups. To each reduced landmark we then associate a set of policies independent of global context. This enables an efficient online policy instantiation process that composes from already learnt policy templates. This is illustrated on a spatial navigation domain where the learning agent is shown to be able to play a pursuit-evasion game in random environments with unknown dynamic obstacles.

Publication
Proceedings of the 2014 PRASA, RobMech and AfLaT International Joint Symposium
Benjamin Rosman
Benjamin Rosman
Lab Director

I am a Professor in the School of Computer Science and Applied Mathematics at the University of the Witwatersrand in Johannesburg. I work in robotics, artificial intelligence, decision theory and machine learning.