UCL Discovery

Possible Principles for Aligned Structure Learning Agents

Da Costa, Lancelot; Gavenčiak, Tomáš; Hyland, David; Samiei, Mandana; Dragos-Manta, Cristian; Pattisapu, Candice; Razi, Adeel; (2025) Possible Principles for Aligned Structure Learning Agents. Neural Computation pp. 1-47. 10.1162/NECO.a.39. (In press).

Access restricted to UCL open access staff until 15 February 2026.


Abstract

This paper offers a road map for the development of scalable aligned artificial intelligence (AI) from first principle descriptions of natural intelligence. In brief, a possible path toward scalable aligned AI rests on enabling artificial agents to learn a good model of the world that includes a good model of our preferences. For this, the main objective is creating agents that learn to represent the world and other agents’ world models, a problem that falls under structure learning (also known as causal representation learning or model discovery). We expose the structure learning and alignment problems with this goal in mind, as well as principles to guide us forward, synthesizing various ideas across mathematics, statistics, and cognitive science. We discuss the essential role of core knowledge, information geometry, and model reduction in structure learning and suggest core structural modules to learn a wide range of naturalistic worlds. We then outline a way toward aligned agents through structure learning and theory of mind. As an illustrative example, we mathematically sketch Asimov’s laws of robotics, which prescribe agents to act cautiously to minimize the ill-being of other agents. We supplement this example by proposing refined approaches to alignment. These observations may guide the development of artificial intelligence in helping to scale existing, or design new, aligned structure learning systems.

Type: Article
Title: Possible Principles for Aligned Structure Learning Agents
Location: United States
DOI: 10.1162/NECO.a.39
Publisher version: https://doi.org/10.1162/neco.a.39
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification: UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > UCL Queen Square Institute of Neurology > Imaging Neuroscience
URI: https://discovery.ucl.ac.uk/id/eprint/10218576
