Abstract
Animals integrate knowledge about how the state of the environment evolves to choose actions that maximise reward. Such goal-directed behaviour, or model-based (MB) reinforcement learning (RL), can flexibly adapt choices to environmental change and is thus distinct from simpler habitual, or model-free (MF), RL strategies. Previous inactivation and neuroimaging work implicates the prefrontal cortex (PFC) and the caudate nucleus of the striatum in MB-RL; however, little is known about its implementation at the single-neuron level. Here, we recorded from two PFC regions, the dorsal anterior cingulate cortex (ACC) and dorsolateral PFC (DLPFC), and two striatal regions, the caudate and putamen, while two rhesus macaques performed a sequential decision-making (two-step) task in which MB-RL requires knowledge of the statistics of rewards and state transitions. All four regions, but particularly the ACC, encoded the rewards received and tracked the probabilistic state transitions that occurred. However, the ACC (and, to a lesser extent, the caudate) encoded the key interaction of the task, between reward, transition and choice, which underlies MB decision-making. ACC and caudate neurons also encoded MB-derived estimates of choice values. Moreover, caudate value estimates of the choice options flipped when a rare transition occurred, demonstrating value updating based on structural knowledge of the task. The striatal regions were unique (relative to the PFC) in encoding the current and previous rewards with opposing polarities, reminiscent of dopaminergic neurons and indicative of an MF prediction error. Our findings provide a deeper understanding of selective and temporally dissociable neural mechanisms underlying goal-directed behaviour.
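For readers unfamiliar with the two-step task, the sketch below illustrates in schematic Python how MB and MF learners assign credit differently after a rare transition, the logic behind the value "flip" described above. The learning rate, transition probabilities and update rules are illustrative assumptions drawn from standard formulations of the task, not the model fitted in this study.

import numpy as np

ALPHA = 0.3      # learning rate (illustrative)
COMMON_P = 0.7   # probability of the "common" transition (assumed)

stage2_value = np.zeros(2)   # values of the two second-stage states
mf_value = np.zeros(2)       # MF values of the two first-stage choices

# Transition model: P(second-stage state | first-stage choice).
# Choice 0 commonly leads to state 0; choice 1 commonly leads to state 1.
transition = np.array([[COMMON_P, 1 - COMMON_P],
                       [1 - COMMON_P, COMMON_P]])

def mb_values():
    """MB first-stage values: expectation of second-stage values under the
    transition model, so credit follows the task structure rather than the
    action that happened to be taken."""
    return transition @ stage2_value

def update(choice, state, reward):
    """One trial: first-stage choice, observed second-stage state, reward."""
    # Stage-2 prediction error updates the visited second-stage value.
    stage2_value[state] += ALPHA * (reward - stage2_value[state])
    # MF stage-1 update ignores the transition model and simply backs up
    # the updated second-stage value to the chosen first-stage action.
    mf_value[choice] += ALPHA * (stage2_value[state] - mf_value[choice])

# Example: a rewarded trial reached via a RARE transition (choice 0 -> state 1).
update(choice=0, state=1, reward=1.0)
print("MF values:", mf_value)      # credit accrues to the chosen action (0)
print("MB values:", mb_values())   # credit accrues mostly to the other action (1),
                                   # since state 1 is commonly reached via choice 1

After a rewarded rare transition, the MF learner raises the value of the action it took, whereas the MB learner raises the value of the alternative action, mirroring the flipped value estimates reported for caudate neurons.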
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
* The manuscript has been revised to clarify the explore/exploit analysis, to explain the visual confounds in the transition structure, and to integrate the putamen findings more clearly. All previously missing figures and relevant statistics are now included, and all figure legends are fully annotated. The revised manuscript is stronger and more accessible as a result. Importantly, none of the revisions altered the central conclusions of our study, which provide novel evidence from non-human primate single-unit recordings that model-based (MB) and model-free (MF) reinforcement learning processes are dissociably encoded across distinct prefrontal and striatal circuits.
Funder Information Declared
Wellcome Trust, 096689/Z/11/Z, 220296/Z/20/Z, 219525/Z/19/Z, 214314/Z/18/Z
Biotechnology and Biological Sciences Research Council, https://ror.org/00cwqg982, BB/W003392/1
Fundação para a Ciência e Tecnologia, SFRH/BD/51711/2011
Santa Casa da Misericórdia de Lisboa, Prémio João Lobo Antunes 2017
Rosetrees Trust, https://ror.org/04e3zg361
Gatsby Initiative for Brain Development and Psychiatry, GAT3955
Jean-François and Marie-Laure de Clermont-Tonnerre Foundation
Max Planck Society, https://ror.org/01hhn8329
Alexander von Humboldt Foundation, https://ror.org/012kf4317