UCL Discovery

Generalization Bound of Gradient Descent for Non-Convex Metric Learning

Dong, M; Yang, X; Zhu, R; Wang, Y; Xue, J; (2020) Generalization Bound of Gradient Descent for Non-Convex Metric Learning. In: Advances in Neural Information Processing Systems 33 pre-proceedings (NeurIPS 2020). Neural Information Processing Systems Foundation (NeurIPS): Virtual conference. (In press). Green open access

MingzhiDong-NeurIPS2020-accepted.pdf - Accepted Version. Download (1MB)

Abstract

Metric learning aims to learn a distance measure that can benefit distance-based methods such as the nearest neighbour (NN) classifier. While considerable effort has been made to improve its empirical performance and to analyze its generalization ability by focusing on data structure and model complexity, an unresolved question is how choices of algorithmic parameters, such as the number of training iterations, affect metric learning, which is typically formulated as an optimization problem and nowadays more often as a non-convex one. In this paper, we theoretically address this question and prove agnostic Probably Approximately Correct (PAC) learnability for metric learning algorithms whose non-convex objective functions are optimized via gradient descent (GD); in particular, our theoretical guarantee takes the iteration number into account. We first show that a generalization PAC bound is a sufficient condition for agnostic PAC learnability and that this bound can be obtained by ensuring uniform convergence on a densely concentrated subset of the parameter space. We then show that, for classifiers optimized via GD, generalizability can be guaranteed if the classifier and the loss function are both Lipschitz smooth, and can be further improved by using fewer iterations. To illustrate and exploit these theoretical findings, we finally propose a novel metric learning method called Smooth Metric and representative Instance LEarning (SMILE), which is designed to satisfy the Lipschitz smoothness property and is learned via GD with an early stopping mechanism, yielding better discriminability and a lower computational cost for the NN classifier.
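
The following is a minimal, illustrative sketch (not the authors' SMILE method) of the kind of pipeline the abstract describes: a Mahalanobis-style metric parameterized by a linear map, a softplus-smoothed pairwise loss so that the objective stays Lipschitz smooth, plain gradient descent, and early stopping so that fewer iterations are used. The synthetic data, the loss form, and all hyperparameters below are assumptions made purely for illustration.

```python
# Hypothetical sketch of metric learning via gradient descent with early
# stopping. NOT the authors' SMILE method: the smooth pairwise loss, the
# synthetic data, and the hyperparameters are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-class data (assumption): class 0 around -1, class 1 around +1.
X = np.vstack([rng.normal(-1.0, 1.0, (50, 5)), rng.normal(1.0, 1.0, (50, 5))])
y = np.array([0] * 50 + [1] * 50)

def pairwise_smooth_loss(L, X, y, margin=1.0, beta=5.0):
    """Softplus-smoothed contrastive loss over all pairs.

    Distances are computed in the learned space Z = X @ L.T, so the implied
    Mahalanobis metric is M = L.T @ L. The softplus keeps the loss Lipschitz
    smooth, matching the smoothness condition discussed in the abstract.
    """
    Z = X @ L.T
    diff = Z[:, None, :] - Z[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)                  # squared pairwise distances
    same = (y[:, None] == y[None, :]).astype(float)
    # Pull same-class pairs together, push different-class pairs past a margin.
    raw = same * d2 + (1.0 - same) * (margin - d2)
    return np.mean(np.logaddexp(0.0, beta * raw)) / beta   # softplus smoothing

def grad(L, X, y, eps=1e-5):
    """Numerical gradient via finite differences -- adequate for a small sketch."""
    g = np.zeros_like(L)
    for idx in np.ndindex(L.shape):
        Lp, Lm = L.copy(), L.copy()
        Lp[idx] += eps
        Lm[idx] -= eps
        g[idx] = (pairwise_smooth_loss(Lp, X, y) - pairwise_smooth_loss(Lm, X, y)) / (2 * eps)
    return g

# Train/validation split used only for early stopping (assumption: 80/20 split).
perm = rng.permutation(len(X))
tr, va = perm[:80], perm[80:]

L = rng.normal(0.0, 0.1, (3, 5))   # rank-3 linear map (assumption)
lr, patience = 0.1, 10
best_val, best_L, stall = np.inf, L.copy(), 0

for step in range(500):
    L -= lr * grad(L, X[tr], y[tr])
    val = pairwise_smooth_loss(L, X[va], y[va])
    if val < best_val - 1e-6:
        best_val, best_L, stall = val, L.copy(), 0
    else:
        stall += 1
        if stall >= patience:       # early stopping: use fewer GD iterations
            break

M = best_L.T @ best_L               # final learned Mahalanobis metric
print(f"stopped after {step + 1} iterations, validation loss {best_val:.4f}")
```

In this sketch, early stopping plays the role of the iteration-number control highlighted in the abstract: halting after fewer GD steps trades a small amount of training loss for the improved generalization guarantee the bound associates with fewer iterations.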

Type: Proceedings paper
Title: Generalization Bound of Gradient Descent for Non-Convex Metric Learning
Event: Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS 2020)
Dates: 06 December 2020 - 12 December 2020
Open access status: An open access version is available from UCL Discovery
Publisher version: https://proceedings.neurips.cc/paper/2020/hash/6f5...
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science
URI: https://discovery.ucl.ac.uk/id/eprint/10110887
Downloads since deposit: 63