
Abstract

Reinforcement learning (RL) is one of the most promising pathways towards building decision-making systems that can learn from their own successes and mistakes. However, despite this potential, RL agents often struggle to learn complex tasks, proving inefficient, in terms of both samples and computational resources, and unstable in practice. To enable RL-based agents to live up to their potential, we need to address these limitations.

To this end, we take a close look at the mechanisms that lead to unstable and inefficient value function learning with neural networks. Learned value functions overestimate true returns during training, and this overestimation is linked to unstable learning in the feature representation layers of neural networks. To counteract this, we show the need for proper normalization of learned value approximations. Building on this insight, we then investigate model-based auxiliary tasks to further stabilize feature learning. We find that model-based self-prediction, combined with value learning, leads to stable features.
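The two ingredients named above, feature normalization and model-based self-prediction, can be sketched as follows. This is our own minimal NumPy illustration of the general ideas; the function names, the layer-norm choice, and the exact loss are assumptions, not the dissertation's implementation:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each feature vector to zero mean / unit variance along the
    # last axis, one common way to keep learned value features well scaled.
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + eps)

def self_prediction_loss(encoder, dynamics, obs, next_obs):
    # Latent self-prediction as an auxiliary task: predict the next latent
    # state and regress it onto a fixed target. In a real agent the target
    # branch would sit behind a stop-gradient; in plain NumPy we simply do
    # not differentiate through it.
    z = layer_norm(encoder(obs))
    z_next_target = layer_norm(encoder(next_obs))  # treated as a fixed target
    z_next_pred = dynamics(z)
    return np.mean((z_next_pred - z_next_target) ** 2)
```

In practice this auxiliary loss would be added to the usual temporal-difference value loss, so the encoder is shaped by both objectives at once.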

Moving beyond feature learning, we investigate decision-aware model learning. We find that, similar to the issues encountered in representation learning, tying model updates to the value function can lead to unstable and even divergent model learning. This problem can be mitigated in observation-space models by using the value function's gradient to measure its sensitivity with respect to model errors. We then combine our insights into representation learning and model learning: we discuss the family of value-aware model learning algorithms and show how to extend their losses to account for learning with stochastic models. Finally, we show that combining all previous insights in a unified architecture leads to stable and efficient value function learning.
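The two model-learning ideas mentioned above can be sketched as follows; this is a NumPy illustration under our own reading of the abstract, with hypothetical function names and loss forms, not the dissertation's actual losses:

```python
import numpy as np

def value_aware_model_loss(value_fn, model, states, actions, next_states):
    # Value-aware model learning, in its simplest form: score the model not
    # by raw observation error, but by how much its prediction error changes
    # the value assigned to the next state.
    pred_next = model(states, actions)
    return np.mean((value_fn(pred_next) - value_fn(next_states)) ** 2)

def gradient_weighted_model_loss(value_grad_fn, model, states, actions, next_states):
    # Hypothetical observation-space variant: use the value function's
    # gradient |dV/ds| as a local sensitivity measure, so model errors along
    # value-relevant directions are penalized more heavily.
    pred_next = model(states, actions)
    sensitivity = np.abs(value_grad_fn(next_states))
    return np.mean((sensitivity * (pred_next - next_states)) ** 2)
```

Both losses vanish for a perfect model; they differ in how they weight imperfect predictions, which is exactly where tying the model to the value function can help or, if the value function itself is unstable, hurt.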

Details

Title
Learning to Model What Matters – Representations and World Models for Efficient Reinforcement Learning
Number of pages
223
Publication year
2025
Degree date
2025
School code
0779
Source
DAI-B 87/5(E), Dissertation Abstracts International
ISBN
9798265437457
Committee member
Cunningham, William
University/institution
University of Toronto (Canada)
Department
Computer Science
University location
Canada -- Ontario, CA
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
32169388
ProQuest document ID
3276209807
Document URL
https://www.proquest.com/dissertations-theses/learning-model-what-matters-representations-world/docview/3276209807/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic