Data in the robotics domain is noisy, high-dimensional, continuous, and expensive to generate, which makes learning skills difficult. Given these constraints, we would like robots to learn skills for solving a variety of tasks; the challenge is doing so without incurring high sample complexity.
My current work aims to solve this problem by learning action priors – a distribution over the useful actions taken by expert agents across all states (robot configurations). The hope is that transferring these action priors between tasks will shrink the exploration space (the set of actions worth considering in a given state), thereby reducing sample complexity.
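As a minimal sketch of this idea (with hypothetical action names and toy demonstration data, not my actual setup), an action prior can be estimated from expert state-action pairs by counting action frequencies, and exploration can then sample from that prior instead of uniformly:

```python
import random
from collections import Counter

# Hypothetical expert demonstrations: (state, action) pairs from prior tasks.
expert_data = [
    ("near_goal", "forward"), ("near_goal", "forward"),
    ("near_goal", "grasp"), ("blocked", "turn_left"),
    ("blocked", "turn_left"), ("blocked", "forward"),
]

ACTIONS = ["forward", "turn_left", "turn_right", "grasp"]

def learn_action_prior(data, actions, smoothing=1.0):
    """State-independent prior: smoothed frequency of each expert action."""
    counts = Counter(a for _, a in data)
    total = len(data) + smoothing * len(actions)
    return {a: (counts[a] + smoothing) / total for a in actions}

def sample_exploration_action(prior):
    """Sample an exploratory action from the prior instead of uniformly."""
    actions, weights = zip(*prior.items())
    return random.choices(actions, weights=weights, k=1)[0]

prior = learn_action_prior(expert_data, ACTIONS)
action = sample_exploration_action(prior)
```

Actions the experts used often receive higher probability, so a learner exploring under this prior spends fewer samples on actions that were never useful.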
My current focus is learning action priors conditioned on observations (sensor data). The hope is that such priors generalize to states with similar features, i.e., states where similar sensor data is observed.
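One way to make the generalization mechanism concrete (again a toy sketch with invented observation vectors, not my actual method) is to weight each expert example by how close its observation is to the current one, so that similar sensor readings yield similar action priors:

```python
import math
from collections import Counter

# Hypothetical expert data: (observation_vector, action) pairs.
expert_data = [
    ((0.9, 0.1), "forward"), ((0.8, 0.2), "forward"),
    ((0.1, 0.9), "turn_left"), ((0.2, 0.8), "turn_left"),
]
ACTIONS = ["forward", "turn_left", "grasp"]

def conditioned_prior(obs, data, actions, bandwidth=0.5, smoothing=0.1):
    """Kernel-weighted action counts: expert examples whose observations
    lie near `obs` contribute more to the prior for that observation."""
    weights = Counter()
    for o, a in data:
        dist2 = sum((x - y) ** 2 for x, y in zip(obs, o))
        weights[a] += math.exp(-dist2 / (2 * bandwidth ** 2))
    total = sum(weights.values()) + smoothing * len(actions)
    return {a: (weights[a] + smoothing) / total for a in actions}

# An observation resembling the "forward" examples inherits their prior.
prior = conditioned_prior((0.85, 0.15), expert_data, ACTIONS)
```

A novel state whose sensor data resembles previously seen states then receives a similar action prior, which is exactly the generalization across states with similar features described above.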