Neural computations underlying inverse reinforcement learning in the human brain

Creators: Collette, Sven; Pauli, Wolfgang M.; Bossaerts, Peter; O'Doherty, John

Abstract

In inverse reinforcement learning (inverse RL), an observer infers the reward distribution available for actions in the environment solely by observing the actions implemented by another agent. To address whether this computational process is implemented in the human brain, participants underwent fMRI while learning about slot machines yielding hidden preferred and non-preferred food outcomes with varying probabilities, through observing the repeated slot choices of agents with similar and dissimilar food preferences. Using formal model comparison, we found that participants implemented inverse RL as opposed to a simple imitation strategy, in which the actions of the other agent are copied without inferring the underlying reward structure of the decision problem. Our computational fMRI analysis revealed that anterior dorsomedial prefrontal cortex encoded inferences about action-values within the value space of the agent as opposed to that of the observer, demonstrating that inverse RL is an abstract cognitive process divorceable from the observer's own values and concerns.
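The contrast the abstract draws between imitation and inverse RL can be illustrated with a toy example. The sketch below is not the authors' model. It assumes three slot machines, a hypothetical set of candidate payout probabilities (`CANDIDATE_PROBS`), an agent that chooses by softmax with an assumed inverse temperature `beta`, and only two food outcomes, so that an observer with opposite preferences values a machine at one minus the agent's value. All function names are illustrative.

```python
import math
from itertools import permutations

# Hypothetical setup: each machine hides a probability of paying out the
# agent's preferred food (otherwise the non-preferred food is delivered).
CANDIDATE_PROBS = (0.2, 0.5, 0.8)  # illustrative values, not from the study


def softmax_choice_prob(values, chosen, beta):
    """Probability that a softmax agent picks `chosen` given action values."""
    weights = [math.exp(beta * v) for v in values]
    return weights[chosen] / sum(weights)


def imitation_choice(observed_choices, n_machines=3):
    """Imitation: copy whichever machine the agent picked most often."""
    counts = [observed_choices.count(m) for m in range(n_machines)]
    return counts.index(max(counts))


def inverse_rl_values(observed_choices, beta=5.0):
    """Inverse RL sketch: posterior-mean payout probability per machine.

    Each hypothesis assigns the candidate probabilities to the machines in
    some order. Hypotheses are weighted by the likelihood of the agent's
    observed choices under a softmax policy, starting from a uniform prior.
    """
    hypotheses = list(permutations(CANDIDATE_PROBS))
    log_post = [
        sum(math.log(softmax_choice_prob(h, c, beta)) for c in observed_choices)
        for h in hypotheses
    ]
    peak = max(log_post)  # subtract the maximum for numerical stability
    post = [math.exp(lp - peak) for lp in log_post]
    total = sum(post)
    post = [p / total for p in post]
    n_machines = len(CANDIDATE_PROBS)
    return [
        sum(p * h[m] for p, h in zip(post, hypotheses))
        for m in range(n_machines)
    ]


def inverse_rl_choice(observed_choices, similar_preferences, beta=5.0):
    """Pick the machine that is best in the observer's own value space."""
    agent_values = inverse_rl_values(observed_choices, beta)
    if similar_preferences:
        own_values = agent_values
    else:
        # Two-outcome assumption: the observer prefers the other food.
        own_values = [1.0 - v for v in agent_values]
    return own_values.index(max(own_values))


# Ten observed choices by the agent (machine indices 0..2), made up here.
observed = [2, 2, 1, 2, 2, 0, 2, 1, 2, 2]
print(imitation_choice(observed))
print(inverse_rl_choice(observed, similar_preferences=True))
print(inverse_rl_choice(observed, similar_preferences=False))
```

With these made-up observations, imitation and inverse RL agree on machine 2 when the agent shares the observer's preferences. When preferences differ, imitation still copies machine 2, whereas the inverse RL observer picks machine 0, the machine the agent avoided most.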

Additional Information

© 2017 Collette et al. This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Received: 19 June 2017; Accepted: 11 October 2017; Published: 30 October 2017.

Data availability: The full anonymized dataset from this study is available in the NDAR data repository (https://ndar.nih.gov/) under the collection ID 2417. Summary information on the data (e.g. additional details about the experiment such as picture files or exact timings of stimuli) is available on the NDA home page without the need for an NDA account. To request access to detailed human subjects data, you must be sponsored by an NIH-recognized institution with a Federalwide Assurance and have a research-related need to access NDA data. Further information on how to request access can be found at https://ndar.nih.gov/access.html. The fMRI activation maps are available at NeuroVault (http://neurovault.org/collections/ZZHNHAJU/).

This work was supported by the NIMH Caltech Conte Center for the Neurobiology of Social Decision Making (JPO). We thank Tim Armstrong and Lynn K Paul for support with the participant recruitment, and Ralph E Lee and Julian M Tyszka for assistance with the experiments. The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. The authors declare that no competing interests exist.
Author contributions: Sven Collette, Conceptualization, Data curation, Software, Formal analysis, Validation, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing; Wolfgang M Pauli, Investigation, Writing—review and editing; Peter Bossaerts, Resources, Software, Methodology, Writing—original draft, Writing—review and editing; John O'Doherty, Conceptualization, Supervision, Funding acquisition, Methodology, Writing—original draft, Writing—review and editing.

Attached Files

Published - elife-29718-v1.pdf

Supplemental Material - elife-29718-transrepform-v1.pdf

Files (1.7 MB)

Name	Size
elife-29718-transrepform-v1.pdf md5:585f884dff11df01174e18d5754409fb	721.6 kB
elife-29718-v1.pdf md5:efa8230e05a704db614d49ef1ac30f3b	1.0 MB
