Algorithm Development in Neural Networks: Insights from the Streaming Parity Task

Van Rossem, Loek (2025) Algorithm Development in Neural Networks: Insights from the Streaming Parity Task. In: Proceedings of the 42nd International Conference on Machine Learning. PMLR: Vancouver, Canada. Green open access.

van-rossem25a.pdf - Published Version

Abstract

Even when massively overparameterized, deep neural networks show a remarkable ability to generalize. Research on this phenomenon has focused on generalization within distribution, via smooth interpolation. Yet in some settings neural networks also learn to extrapolate to data far beyond the bounds of the original training set, sometimes even allowing for infinite generalization, implying that an algorithm capable of solving the task has been learned. Here we undertake a case study of the learning dynamics of recurrent neural networks trained on the streaming parity task in order to develop an effective theory of algorithm development. The streaming parity task is a simple but nonlinear task defined on sequences up to arbitrary length. We show that, with sufficient finite training experience, RNNs exhibit a phase transition to perfect infinite generalization. Using an effective theory for the representational dynamics, we find an implicit representational merger effect which can be interpreted as the construction of a finite automaton that reproduces the task. Overall, our results disclose one mechanism by which neural networks can generalize infinitely from finite training experience.
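As an illustration of the task and of the kind of finite automaton the abstract refers to, the following is a minimal sketch. It assumes the usual reading of streaming parity, in which the target at each time step is the parity (XOR) of all bits seen so far; the abstract itself does not spell out the definition. The names `streaming_parity`, `TRANSITIONS`, and `run_automaton` are hypothetical and are not taken from the paper.

```python
from itertools import accumulate
from operator import xor


def streaming_parity(bits):
    """Assumed task definition: output t is the XOR of bits[0..t]."""
    return list(accumulate(bits, xor))


# Hypothetical two-state automaton: the state is the parity so far.
# Keys are (current_state, input_bit); values are the next state.
TRANSITIONS = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def run_automaton(bits, state=0):
    """Step through the automaton, emitting the state after each bit."""
    outputs = []
    for b in bits:
        state = TRANSITIONS[(state, b)]
        outputs.append(state)
    return outputs


if __name__ == "__main__":
    example = [1, 0, 1, 1]
    print(streaming_parity(example))
    print(run_automaton(example))
```

Because the automaton carries only a single bit of state, it produces the correct output for sequences of any length. This is the sense in which a learned automaton would generalize beyond the lengths seen in training.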

Type:	Proceedings paper
Title:	Algorithm Development in Neural Networks: Insights from the Streaming Parity Task
Event:	ICML 2025: Forty-Second International Conference on Machine Learning
Open access status:	An open access version is available from UCL Discovery
Publisher version:	https://openreview.net/forum?id=3go0lhfxd0
Language:	English
Additional information:	This work is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).
Keywords:	Out-of-distribution generalization, Algorithm discovery, Deep learning theory, Mechanistic Interpretability
UCL classification:	UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Gatsby Computational Neurosci Unit
URI:	https://discovery.ucl.ac.uk/id/eprint/10216304
