Reinforcement learning (RL) is one of the most promising pathways towards building decision-making systems that can learn from their own successes and mistakes. However, despite this potential, RL agents often struggle to learn complex tasks: in practice they are inefficient, both in terms of samples and computational resources, and unstable. To enable RL-based agents to live up to their potential, we need to address these limitations.
To this end, we take a close look at the mechanisms that lead to unstable and inefficient value function learning with neural networks. Learned value functions overestimate true returns during training, and this overestimation is linked to unstable learning in the feature representation layers of neural networks. To counteract this, we show the need for proper normalization of the learned value approximations. Building on this insight, we then investigate model-based auxiliary tasks to further stabilize feature learning. We find that model-based self-prediction, in combination with value learning, leads to stable features.
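As a minimal sketch of these two ingredients, the snippet below combines a normalized feature layer with a latent self-prediction auxiliary loss alongside the value loss. It is illustrative only: the module names (`encoder`, `dynamics`) and the weighting `aux_weight` are assumptions, not the exact architecture or losses studied in this work.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ValueNet(nn.Module):
    def __init__(self, obs_dim, act_dim, hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden),
        )
        self.norm = nn.LayerNorm(hidden)       # normalize the learned features
        self.value_head = nn.Linear(hidden, 1)
        # latent transition model used for the self-prediction auxiliary task
        self.dynamics = nn.Linear(hidden + act_dim, hidden)

    def features(self, obs):
        return self.norm(self.encoder(obs))

    def forward(self, obs):
        return self.value_head(self.features(obs))

def training_loss(net, obs, act, target_value, next_obs, aux_weight=1.0):
    z = net.features(obs)
    value_loss = F.mse_loss(net.value_head(z), target_value)
    # predict the (stop-gradient) features of the next observation
    with torch.no_grad():
        z_next = net.features(next_obs)
    z_pred = net.dynamics(torch.cat([z, act], dim=-1))
    aux_loss = F.mse_loss(z_pred, z_next)
    return value_loss + aux_weight * aux_loss
```

The stop-gradient on the target features is one common design choice for self-prediction objectives; it keeps the auxiliary task from collapsing the representation through the target branch.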
Moving beyond feature learning, we investigate decision-aware model learning. We find that, similar to the issues encountered in representation learning, tying model updates to the value function can lead to unstable and even diverging model learning. This problem can be mitigated in observation-space models by using the value function gradient to measure its sensitivity with respect to model errors. We then combine our insights into representation learning and model learning. We discuss the family of value-aware model learning algorithms and show how to extend their losses to account for learning with stochastic models. Finally, we show that combining all previous insights into a unified architecture can lead to stable and efficient value function learning.
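The following sketch illustrates the idea of weighting observation-space model errors by the value function gradient, so that the model is penalized most where the value estimate is sensitive. It is a first-order approximation under assumed interfaces (`model(obs, act)` returning a predicted next observation, `value_fn(obs)` returning values) and is not the exact loss derived in this work.

```python
import torch

def value_gradient_weighted_loss(model, value_fn, obs, act, next_obs):
    pred_next = model(obs, act)                    # predicted next observation
    next_obs = next_obs.detach().requires_grad_(True)
    v = value_fn(next_obs).sum()
    grad_v = torch.autograd.grad(v, next_obs)[0]   # value sensitivity at the true next state
    # first-order estimate of the value error induced by the model error
    err = ((pred_next - next_obs.detach()) * grad_v.detach()).sum(dim=-1)
    return (err ** 2).mean()
```

Compared with a plain squared error over observations, this weighting focuses model capacity on the state dimensions that actually affect the value estimate.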