Monti, RP;
Tootoonian, S;
Cao, R;
(2018)
Avoiding degradation in deep feed-forward networks by phasing out skip-connections.
In:
Proceedings of the International Conference on Artificial Neural Networks and Machine Learning : ICANN 2018.
(pp. 447-456).
Springer: Rhodes, Greece.
Abstract
A widely observed phenomenon in deep learning is the degradation problem: increasing the depth of a network leads to a decrease in performance on both test and training data. Novel architectures such as ResNets and Highway networks have addressed this issue by introducing various flavors of skip-connections or gating mechanisms. However, the degradation problem persists in the context of plain feed-forward networks. In this work we propose a simple method to address this issue. The proposed method poses the learning of weights in deep networks as a constrained optimization problem where the presence of skip-connections is penalized by Lagrange multipliers. This allows for skip-connections to be introduced during the early stages of training and subsequently phased out in a principled manner. We demonstrate the benefits of such an approach with experiments on MNIST, fashion-MNIST, CIFAR-10 and CIFAR-100 where the proposed method is shown to greatly decrease the degradation effect and is often competitive with ResNets.
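The constrained formulation described in the abstract can be sketched as follows. This is one plausible reading, not the authors' exact notation: the layer parameterization, the symbols $\alpha_l$, $\lambda_l$, $W_l$, $D$, and the step sizes $\eta$, $\eta'$ are all assumptions introduced here for illustration.

```latex
% Illustrative sketch only: symbols and parameterization are assumptions,
% not taken from this record.
\begin{align*}
% Layer l carries a skip-connection scaled by (1 - \alpha_l).
x_{l+1} &= f(x_l; W_l) + (1 - \alpha_l)\, x_l, \qquad l = 1, \dots, D \\
% Train the weights subject to every skip-connection being removed.
\min_{W,\alpha}\; & \mathcal{L}(W, \alpha)
  \quad \text{subject to} \quad \alpha_l = 1 \ \text{ for all } l \\
% Lagrangian with one multiplier \lambda_l per layer.
\Lambda(W, \alpha, \lambda) &= \mathcal{L}(W, \alpha)
  + \sum_{l=1}^{D} \lambda_l\, (\alpha_l - 1) \\
% Gradient descent on (W, \alpha), gradient ascent on \lambda.
W_l &\leftarrow W_l - \eta\, \frac{\partial \mathcal{L}}{\partial W_l} \\
\alpha_l &\leftarrow \alpha_l - \eta \left( \frac{\partial \mathcal{L}}{\partial \alpha_l} + \lambda_l \right) \\
\lambda_l &\leftarrow \lambda_l + \eta'\, (\alpha_l - 1)
\end{align*}
```

Under these assumptions, $\alpha_l = 0$ gives a residual-style layer and $\alpha_l = 1$ gives a plain feed-forward layer. While $\alpha_l < 1$ the multiplier $\lambda_l$ keeps decreasing, which pushes $\alpha_l$ upward in its own update, so the skip-connection is present early in training and is gradually phased out.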
Type: | Proceedings paper |
---|---|
Title: | Avoiding degradation in deep feed-forward networks by phasing out skip-connections |
Event: | International Conference on Artificial Neural Networks and Machine Learning : ICANN 2018 |
ISBN-13: | 9783030014230 |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1007/978-3-030-01424-7_44 |
Publisher version: | https://doi.org/10.1007/978-3-030-01424-7_44 |
Language: | English |
Additional information: | This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions. |
Keywords: | Degradation, Shattered/vanishing gradients, Skip-connections |
UCL classification: | UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Life Sciences > Gatsby Computational Neurosci Unit |
URI: | https://discovery.ucl.ac.uk/id/eprint/10061069 |