UCL Discovery

A Theory of Initialisation's Impact on Specialisation

Jarvis, Devon; Lee, Sebastian; Domine, Clementine; Saxe, Andrew; Sarao Mannelli, Stefano (2025) A Theory of Initialisation's Impact on Specialisation. In: Proceedings of the ICLR 2025 Conference. (pp. 1-29). ICLR. Green open access

Abstract

Prior work has demonstrated a consistent tendency in neural networks engaged in continual learning tasks, wherein intermediate task similarity results in the highest levels of catastrophic interference. This phenomenon is attributed to the network's tendency to reuse learned features across tasks. However, this explanation relies heavily on the premise that neuron specialisation occurs, i.e. the emergence of localised representations. Our investigation challenges the validity of this assumption. Using theoretical frameworks for the analysis of neural networks, we show a strong dependence of specialisation on the initial condition. More precisely, we show that weight imbalance and high weight entropy can favour specialised solutions. We then apply these insights in the context of continual learning, first showing the emergence of a monotonic relation between task similarity and forgetting in non-specialised networks. Finally, we show that specialisation by weight imbalance is beneficial for the commonly employed elastic weight consolidation regularisation technique.

Type: Proceedings paper
Title: A Theory of Initialisation's Impact on Specialisation
Event: The Thirteenth International Conference on Learning Representations
Open access status: An open access version is available from UCL Discovery
Publisher version: https://openreview.net/forum?id=RQz7szbVDs
Language: English
Additional information: © The Authors 2025. Original content in this paper is licensed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/).
Keywords: machine learning theory, teacher student setup, initialisation, specialisation, statistical mechanics of learning
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Gatsby Computational Neurosci Unit
URI: https://discovery.ucl.ac.uk/id/eprint/10205844