Biggs, Felix; Guedj, Benjamin; (2022) Non-Vacuous Generalisation Bounds for Shallow Neural Networks. In: Proceedings of the 39th International Conference on Machine Learning. (pp. 1963-1981). PMLR.
Text: Guedj_biggs22a.pdf (562kB)
Abstract
We focus on a specific class of shallow neural networks with a single hidden layer, namely those with L2-normalised data and either a sigmoid-shaped Gaussian error function (“erf”) activation or a Gaussian Error Linear Unit (GELU) activation. For these networks, we derive new generalisation bounds through the PAC-Bayesian theory; unlike most existing such bounds they apply to neural networks with deterministic rather than randomised parameters. Our bounds are empirically non-vacuous when the network is trained with vanilla stochastic gradient descent on MNIST and Fashion-MNIST.
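To make the abstract's two ingredients concrete, here is a minimal sketch: a single-hidden-layer network on L2-normalised inputs with an erf activation, together with the generic PAC-Bayes-kl inversion that converts an empirical risk and a KL term into an upper bound on the true risk. This is not the authors' exact construction (their bound rests on specific prior and posterior choices that make it hold for deterministic parameters); all function names and constants below are illustrative assumptions.

```python
import numpy as np
from scipy.special import erf

def shallow_erf_net(X, W, V):
    """Forward pass of a one-hidden-layer network: L2-normalise each
    input row, apply an erf hidden layer, then a linear output layer."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)  # L2-normalised data
    H = erf(Xn @ W.T)                                  # erf activation
    return H @ V.T

def kl_bernoulli(q, p):
    """KL divergence between Bernoulli(q) and Bernoulli(p)."""
    eps = 1e-12
    q, p = np.clip(q, eps, 1 - eps), np.clip(p, eps, 1 - eps)
    return q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p))

def pac_bayes_kl_bound(emp_risk, kl_term, n, delta=0.05):
    """Invert the standard PAC-Bayes-kl inequality
        kl(emp_risk || true_risk) <= (kl_term + log(2*sqrt(n)/delta)) / n
    by bisection to obtain an upper bound on the true risk.
    (Illustrative generic bound, not the paper's exact statement.)"""
    rhs = (kl_term + np.log(2 * np.sqrt(n) / delta)) / n
    lo, hi = emp_risk, 1.0
    for _ in range(60):  # kl(q || p) is increasing in p for p > q
        mid = (lo + hi) / 2
        if kl_bernoulli(emp_risk, mid) > rhs:
            hi = mid
        else:
            lo = mid
    return hi
```

For example, `pac_bayes_kl_bound(0.02, kl_term=5000.0, n=60000)` returns a risk bound well below 1, which is the sense in which such a bound is "non-vacuous": the KL term must be small relative to the sample size n for the inverted bound to beat the trivial bound of 1.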
| Type: | Proceedings paper |
| --- | --- |
| Title: | Non-Vacuous Generalisation Bounds for Shallow Neural Networks |
| Event: | 39th International Conference on Machine Learning |
| Open access status: | An open access version is available from UCL Discovery |
| Publisher version: | https://proceedings.mlr.press/v162/biggs22a.html |
| Language: | English |
| Additional information: | This version is the version of record. For information on re-use, please refer to the publisher's terms and conditions. |
| UCL classification: | UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science |
| URI: | https://discovery.ucl.ac.uk/id/eprint/10143073 |



