Abstract

Animals integrate knowledge about how the state of the environment evolves to choose actions that maximise reward. Such goal-directed behaviour - or model-based (MB) reinforcement learning (RL) - can flexibly adapt choice to changes and is thus distinct from simpler habitual - or model-free (MF) RL - strategies. Previous inactivation and neuroimaging work implicates prefrontal cortex (PFC) and the caudate striatal region in MB-RL; however, details are scarce about its implementation at the single-neuron level. Here, we recorded from two PFC regions, the dorsal anterior cingulate cortex (ACC) and dorsolateral PFC (DLPFC), and two striatal regions, the caudate and putamen, while two rhesus macaques performed a sequential decision-making (two-step) task in which MB-RL involves knowledge about the statistics of reward and state transitions. All four regions, but particularly the ACC, encoded the rewards received and tracked the probabilistic state transitions that occurred. However, ACC (and to a lesser extent caudate) encoded the key variables of the task - namely the interaction between reward, transition and choice - which underlies MB decision-making. ACC and caudate neurons also encoded MB-derived estimates of choice values. Moreover, caudate value estimates of the choice options flipped when a rare transition occurred, demonstrating value update based on structural knowledge of the task. The striatal regions were unique (relative to PFC) in encoding the current and previous rewards with opposing polarities, reminiscent of dopaminergic neurons, and indicative of an MF prediction error. Our findings provide a deeper understanding of selective and temporally dissociable neural mechanisms underlying goal-directed behaviour.

Competing Interest Statement

The authors have declared no competing interest.

Footnotes

* The manuscript has been revised to clarify the explore/exploit analysis, to explain further the visual confounds in the transition structure, and to integrate the putamen findings more clearly. We have also ensured all previously missing figures and relevant statistics are now included and that all figure legends are fully annotated. The revised manuscript is stronger and more accessible as a result. Importantly, none of the revisions altered the central conclusions of our study, which provide novel evidence from non-human primate single-unit recordings that model-based (MB) and model-free (MF) reinforcement learning processes are dissociably encoded across distinct prefrontal and striatal circuits.

Funder Information Declared

Wellcome Trust, 096689/Z/11/Z, 220296/Z/20/Z, 219525/Z/19/Z, 214314/Z/18/Z

Biotechnology and Biological Sciences Research Council, https://ror.org/00cwqg982, BB/W003392/1

Fundação para a Ciência e Tecnologia, SFRH/BD/51711/2011

Santa Casa da Misericórdia de Lisboa, Premio João Lobo Antunes 2017

Rosetrees Trust, https://ror.org/04e3zg361

Gatsby Initiative for Brain Development and Psychiatry, GAT3955

Jean Francois and Marie-Laure de Clermont Tonerre Foundation

Max Planck Society, https://ror.org/01hhn8329

Alexander von Humboldt Foundation, https://ror.org/012kf4317

Details

1009240
Title
Neural signatures of model-based and model-free reinforcement learning across prefrontal cortex and striatum
Publication title
bioRxiv; Cold Spring Harbor
Publication year
2026
Publication date
Jan 18, 2026
Section
New Results
Publisher
Cold Spring Harbor Laboratory Press
Source
BioRxiv
Place of publication
Cold Spring Harbor
Country of publication
United States
University/institution
Cold Spring Harbor Laboratory Press
Publication subject
ISSN
2692-8205
Source type
Working Paper
Language of publication
English
Document type
Working Paper
Publication history
 
 
Milestone dates
2025-01-12 (Version 1)
ProQuest document ID
3154515771
Document URL
https://www.proquest.com/working-papers/neural-signatures-model-based-free-reinforcement/docview/3154515771/se-2?accountid=208611
Copyright
© 2026. This article is published under http://creativecommons.org/licenses/by-nc-nd/4.0/ (“the License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Last updated
2026-01-19
Database
ProQuest One Academic