UCL Discovery

Reinforcement Learning or Active Inference?

Friston, KJ; Daunizeau, J; Kiebel, SJ; (2009) Reinforcement Learning or Active Inference? PLOS ONE, 4(7), Article e6421. 10.1371/journal.pone.0006421. Green open access

PDF: 112833.pdf (918kB)

Abstract

This paper questions the need for reinforcement learning or control theory when optimising behaviour. We show that it is fairly simple to teach an agent complicated and adaptive behaviours using a free-energy formulation of perception. In this formulation, agents adjust their internal states and sampling of the environment to minimize their free-energy. Such agents learn causal structure in the environment and sample it in an adaptive and self-supervised fashion. This results in behavioural policies that reproduce those optimised by reinforcement learning and dynamic programming. Critically, we do not need to invoke the notion of reward, value or utility. We illustrate these points by solving a benchmark problem in dynamic programming; namely the mountain-car problem, using active perception or inference under the free-energy principle. The ensuing proof-of-concept may be important because the free-energy formulation furnishes a unified account of both action and perception and may speak to a reappraisal of the role of dopamine in the brain.
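The mountain-car benchmark mentioned above has simple, well-known dynamics: an underpowered car in a valley must first accelerate away from the goal to gain enough momentum to climb the hill. The sketch below is not the paper's free-energy agent; it is a minimal simulation of the standard benchmark dynamics (Moore/Sutton-Barto form), with a hand-coded energy-pumping policy standing in for the adaptive behaviour the paper derives. All function names and constants here are illustrative.

```python
import math

def step(x, v, a):
    """One step of the classic mountain-car dynamics.

    x: position in [-1.2, 0.6], v: velocity clipped to [-0.07, 0.07],
    a: applied force in [-1, 1]. Gravity follows the hill profile cos(3x).
    """
    v = v + 0.001 * a - 0.0025 * math.cos(3 * x)
    v = max(-0.07, min(0.07, v))
    x = x + v
    if x < -1.2:          # inelastic wall at the left boundary
        x, v = -1.2, 0.0
    return x, v

def rollout(policy, x=-0.5, v=0.0, max_steps=500):
    """Run a policy from the valley floor; return steps to reach the goal
    at x >= 0.5, or None if the goal is never reached."""
    for t in range(max_steps):
        if x >= 0.5:
            return t
        x, v = step(x, v, policy(x, v))
    return None

# Energy-pumping policy: always push in the direction of current motion.
# Full throttle (a = +1) alone cannot climb the right hill, but this
# oscillating strategy accumulates momentum and reaches the goal.
steps = rollout(lambda x, v: 1.0 if v >= 0 else -1.0)
```

Policies optimised by dynamic programming on this task exhibit the same qualitative behaviour: move away from the target before moving toward it, which is what makes the problem a useful test of whether reward-free schemes such as active inference can reproduce value-based control.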

Type: Article
Title: Reinforcement Learning or Active Inference?
Open access status: An open access version is available from UCL Discovery
DOI: 10.1371/journal.pone.0006421
Publisher version: http://dx.doi.org/10.1371/journal.pone.0006421
Language: English
Additional information: © 2009 Friston et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. The Wellcome Trust: Grant#: WT056750; Modelling functional brain architecture. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Keywords: FREE-ENERGY, BAYESIAN-INFERENCE, DYNAMIC-SYSTEMS, VISUAL-CORTEX, MODELS, DOPAMINE, REWARD, BRAIN, PREDICTION, RESPONSES
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > UCL Queen Square Institute of Neurology
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > UCL Queen Square Institute of Neurology > Imaging Neuroscience
URI: https://discovery.ucl.ac.uk/id/eprint/112833
Downloads since deposit: 89
