Combining Primitive DQNs for Improved Reinforcement Learning in Minecraft

Matthew Reynard, Herman Kamper, Herman A Engelbrecht, Benjamin Rosman

January 2020

Abstract

We ask whether a reinforcement learning agent learns better by first acquiring the skills for smaller tasks and then applying them in a complex environment, or by learning in the complex environment from the start. We investigate this empirically in a non-trivial game environment. Following the premise of curriculum learning, the agent learns different skills in independent, isolated environments, referred to in this paper as dojos. The skills learned in the dojos then serve as the agent's actions: the agent decides which skill best applies to the current game state and performs it. We evaluate this approach with experiments in the Minecraft gaming environment and find that dojo learning achieves better performance with faster training in certain environments. Its main benefit is that the reward function for each action can be finely tuned in its dojo, compared with traditional methods. However, the skills learned in the individual dojos become the limiting factor in performance, as the agent is unable to combine them effectively in certain complex environments. This can be mitigated by training the dojo modules further, which yields results similar to those of the standard method.
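The idea of using dojo-trained skills as actions can be illustrated with a minimal sketch. It is not the paper's implementation: the class name `DojoAgent`, the toy Q-functions, and the two-element state are hypothetical stand-ins, and greedy (argmax) selection at both levels is an assumption made for brevity.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: a Q-function maps a state to one value per choice.
QFunction = Callable[[Sequence[float]], List[float]]


def argmax(values: Sequence[float]) -> int:
    """Return the index of the largest value (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


class DojoAgent:
    """Top-level agent whose actions are skills learned in separate dojos."""

    def __init__(self, skill_qs: List[QFunction], meta_q: QFunction):
        self.skill_qs = skill_qs  # one Q-function per dojo-trained skill
        self.meta_q = meta_q      # Q-function over skills, not primitive actions

    def act(self, state: Sequence[float]) -> Tuple[int, int]:
        # Pick the skill that scores highest for the current game state.
        skill = argmax(self.meta_q(state))
        # Let the chosen skill pick the primitive action.
        action = argmax(self.skill_qs[skill](state))
        return skill, action


# Toy stand-ins for trained networks (assumptions, not learned values).
def navigate_q(state: Sequence[float]) -> List[float]:
    return [state[0], -state[0], 0.0]


def attack_q(state: Sequence[float]) -> List[float]:
    return [0.0, 0.0, 1.0]


def meta_q(state: Sequence[float]) -> List[float]:
    # Prefers the attack skill when the second state feature is large.
    return [1.0 - state[1], state[1]]


agent = DojoAgent([navigate_q, attack_q], meta_q)
far_from_enemy = agent.act([0.5, 0.2])
near_enemy = agent.act([0.5, 0.9])
```

In this sketch only `meta_q` would need training in the complex environment, while each skill's Q-function is trained beforehand in its own dojo with its own reward function.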

Type

Conference paper

Publication

International SAUPEC/RobMech/PRASA Conference

Reinforcement Learning, Curriculum Learning

Benjamin Rosman

Lab Director