Combining Primitive DQNs for Improved Reinforcement Learning in Minecraft

Abstract

We ask whether a reinforcement learning agent learns better by first learning the skills to perform smaller tasks in a complex environment or by learning the skills in the complex environment from the start. This is investigated empirically in a non-trivial game environment. We use the premise of curriculum learning where an agent learns different skills in independent and isolated environments referred to as dojos in this paper. The skills learned in the dojos are then used as different actions as the agent decides which skill to perform that best applies to the current game state. We evaluate this with experiments conducted in the Minecraft gaming environment. We find that dojo learning is able to achieve better performance with faster training time in certain environments. The main benefit of this approach is that the reward functions can be finely tuned in the dojos for each action as compared to the traditional methods. However, the skills learned in the individual dojos become the limiting factor in performance as the agent is unable to combine these skills effectively when put in certain complex environments. This can be mitigated if the dojo modules are further trained to result in similar results as the standard method.

Publication
International SAUPEC/RobMech/PRASA Conference
Benjamin Rosman
Benjamin Rosman
Lab Director

I am a Professor in the School of Computer Science and Applied Mathematics at the University of the Witwatersrand in Johannesburg. I work in robotics, artificial intelligence, decision theory and machine learning.