Da Costa, Lancelot; Gavenčiak, Tomáš; Hyland, David; Samiei, Mandana; Dragos-Manta, Cristian; Pattisapu, Candice; Razi, Adeel (2025) Possible Principles for Aligned Structure Learning Agents. Neural Computation, pp. 1-47. 10.1162/NECO.a.39. (In press).
Full text (PDF, 4MB): access restricted to UCL open access staff until 15 February 2026.
Abstract
This paper offers a road map for the development of scalable aligned artificial intelligence (AI) from first principle descriptions of natural intelligence. In brief, a possible path toward scalable aligned AI rests on enabling artificial agents to learn a good model of the world that includes a good model of our preferences. For this, the main objective is creating agents that learn to represent the world and other agents’ world models, a problem that falls under structure learning (also known as causal representation learning or model discovery). We expose the structure learning and alignment problems with this goal in mind, as well as principles to guide us forward, synthesizing various ideas across mathematics, statistics, and cognitive science. We discuss the essential role of core knowledge, information geometry, and model reduction in structure learning and suggest core structural modules to learn a wide range of naturalistic worlds. We then outline a way toward aligned agents through structure learning and theory of mind. As an illustrative example, we mathematically sketch Asimov’s laws of robotics, which prescribe agents to act cautiously to minimize the ill-being of other agents. We supplement this example by proposing refined approaches to alignment. These observations may guide the development of artificial intelligence in helping to scale existing, or design new, aligned structure learning systems.
| Type: | Article |
|---|---|
| Title: | Possible Principles for Aligned Structure Learning Agents |
| Location: | United States |
| DOI: | 10.1162/NECO.a.39 |
| Publisher version: | https://doi.org/10.1162/neco.a.39 |
| Language: | English |
| Additional information: | This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions. |
| UCL classification: | UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > UCL Queen Square Institute of Neurology > Imaging Neuroscience |
| URI: | https://discovery.ucl.ac.uk/id/eprint/10218576 |