UCL Discovery

Non-Vacuous Generalisation Bounds for Shallow Neural Networks

Biggs, Felix; Guedj, Benjamin; (2022) Non-Vacuous Generalisation Bounds for Shallow Neural Networks. In: Proceedings of the 39th International Conference on Machine Learning. (pp. 1963-1981). MLResearchPress. Green open access.

Text: Guedj_biggs22a.pdf — Download (562kB)

Abstract

We focus on a specific class of shallow neural networks with a single hidden layer, namely those with L2-normalised data and either a sigmoid-shaped Gaussian error function (“erf”) activation or a Gaussian Error Linear Unit (GELU) activation. For these networks, we derive new generalisation bounds through the PAC-Bayesian theory; unlike most existing such bounds they apply to neural networks with deterministic rather than randomised parameters. Our bounds are empirically non-vacuous when the network is trained with vanilla stochastic gradient descent on MNIST and Fashion-MNIST.
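The network class described above — L2-normalised inputs, a single hidden layer with erf or GELU activation, and a linear readout — can be sketched as follows. This is an illustrative NumPy sketch only, not the authors' code; all function and variable names are our own.

```python
import math
import numpy as np

def l2_normalise(x):
    # Scale each input row to unit L2 norm, as the bounds assume.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def erf_act(z):
    # Sigmoid-shaped Gaussian error function activation.
    return np.vectorize(math.erf)(z)

def gelu_act(z):
    # GELU written via erf: z * Phi(z) = 0.5 * z * (1 + erf(z / sqrt(2))).
    return 0.5 * z * (1.0 + erf_act(z / math.sqrt(2.0)))

def shallow_net(x, W, v, activation=erf_act):
    # Single hidden layer (weights W), then a linear output layer (weights v).
    return activation(l2_normalise(x) @ W.T) @ v
```

The point of restricting to these two activations is that both are smooth, erf-based functions, which the paper exploits when deriving PAC-Bayesian bounds for the deterministic network.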

Type: Proceedings paper
Title: Non-Vacuous Generalisation Bounds for Shallow Neural Networks
Event: 39th International Conference on Machine Learning
Open access status: An open access version is available from UCL Discovery
Publisher version: https://proceedings.mlr.press/v162/biggs22a.html
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL
URI: https://discovery.ucl.ac.uk/id/eprint/10143073
