Jarvis, Devon;
Lee, Sebastian;
Domine, Clementine;
Saxe, Andrew;
Sarao Mannelli, Stefano;
(2025)
A Theory of Initialisation's Impact on Specialisation.
In:
Proceedings of the ICLR 2025 Conference.
(pp. 1-29).
ICLR
Text: Domine_A Theory of Initialisation's Impact on Specialisation_VoR.pdf (1MB)
Abstract
Prior work has demonstrated a consistent tendency in neural networks engaged in continual learning tasks, wherein intermediate task similarity results in the highest levels of catastrophic interference. This phenomenon is attributed to the network's tendency to reuse learned features across tasks. However, this explanation heavily relies on the premise that neuron specialisation occurs, i.e. the emergence of localised representations. Our investigation challenges the validity of this assumption. Using theoretical frameworks for the analysis of neural networks, we show a strong dependence of specialisation on the initial condition. More precisely, we show that weight imbalance and high weight entropy can favour specialised solutions. We then apply these insights in the context of continual learning, first showing the emergence of a monotonic relation between task-similarity and forgetting in non-specialised networks. Finally, we show that specialisation by weight imbalance is beneficial on the commonly employed elastic weight consolidation regularisation technique.
Type: | Proceedings paper |
---|---|
Title: | A Theory of Initialisation's Impact on Specialisation |
Event: | The Thirteenth International Conference on Learning Representations |
Open access status: | An open access version is available from UCL Discovery |
Publisher version: | https://openreview.net/forum?id=RQz7szbVDs |
Language: | English |
Additional information: | © The Authors 2025. Original content in this paper is licensed under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) Licence (https://creativecommons.org/licenses/by/4.0/). |
Keywords: | machine learning theory, teacher student setup, initialisation, specialisation, statistical mechanics of learning |
UCL classification: | UCL; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences; UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Gatsby Computational Neurosci Unit |
URI: | https://discovery.ucl.ac.uk/id/eprint/10205844 |



