Unsupervised Hierarchical Skill Discovery

Damion Harvey, Geraud Nangue Tasse, Benjamin Rosman, Branden Ingram, Steven James

July 2026

Abstract

We consider the problem of unsupervised skill segmentation and hierarchical structure discovery in reinforcement learning. While recent approaches have sought to segment trajectories into reusable skills or options, most rely on action labels, rewards, or handcrafted annotations, limiting their applicability. We propose a method that segments unlabelled trajectories into skills and induces a hierarchical structure over them using a grammar-based approach. The resulting hierarchy captures both low-level behaviours and their composition into higher-level skills. We evaluate our approach in high-dimensional, pixel-based environments, including Craftax and the full, unmodified version of Minecraft. Using metrics for skill segmentation, reuse, and hierarchy quality, we find that our method consistently produces more structured and semantically meaningful hierarchies than existing baselines. Furthermore, as a proof of concept, we demonstrate that these discovered hierarchies accelerate and stabilise learning on downstream reinforcement learning tasks.
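As a rough illustration of what inducing a hierarchy "using a grammar-based approach" can look like, the sketch below applies a Re-Pair-style compression to a sequence of skill labels. It is not the authors' method: the abstract does not specify the grammar algorithm, and the function names (`induce_grammar`, `expand`), the `min_count` parameter, and the example skill labels are all hypothetical. It assumes a segmentation step has already produced one discrete label per trajectory segment.

```python
from collections import Counter


def induce_grammar(sequence, min_count=2):
    """Re-Pair-style sketch: repeatedly replace the most frequent adjacent
    pair of symbols with a new non-terminal until no pair repeats."""
    seq = list(sequence)
    rules = {}  # non-terminal -> (left symbol, right symbol)
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < min_count:
            break
        symbol = f"S{next_id}"
        next_id += 1
        rules[symbol] = pair
        # Rewrite the sequence left to right, replacing non-overlapping matches.
        rewritten, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                rewritten.append(symbol)
                i += 2
            else:
                rewritten.append(seq[i])
                i += 1
        seq = rewritten
    return seq, rules


def expand(symbol, rules):
    """Recursively unfold a symbol back into its low-level skill labels."""
    if symbol not in rules:
        return [symbol]
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


# Hypothetical skill labels, one per segment of an unlabelled trajectory.
skills = ["chop", "craft", "chop", "craft", "mine", "chop", "craft"]
top_level, rules = induce_grammar(skills)
```

Here the repeated pair `("chop", "craft")` becomes a higher-level skill `S0`, and the top-level sequence is rewritten in terms of it. Calling `expand` on each top-level symbol recovers the original low-level sequence, so the hierarchy is a lossless re-description of the trajectory.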

Type

Conference paper

Publication

In International Conference on Machine Learning

Learning from Demonstration, Hierarchical Reinforcement Learning

Damion Harvey

Geraud Nangue Tasse, Lecturer

Benjamin Rosman, Lab Director

Branden Ingram, Lecturer

Steven James, Lab Director