When Is There a Representer Theorem? Vector Versus Matrix Regularizers.
Journal of Machine Learning Research, pp. 2507-2529.
We consider a general class of regularization methods which learn a vector of parameters on the basis of linear measurements. It is well known that if the regularizer is a nondecreasing function of the L2 norm, then the learned vector is a linear combination of the input data. This result, known as the representer theorem, lies at the basis of kernel-based methods in machine learning. In this paper, we prove the necessity of the above condition, in the case of differentiable regularizers. We further extend our analysis to regularization methods which learn a matrix, a problem which is motivated by the application to multi-task learning. In this context, we study a more general representer theorem, which holds for a larger class of regularizers. We provide a necessary and sufficient condition characterizing this class of matrix regularizers and we highlight some concrete examples of practical importance. Our analysis uses basic principles from matrix theory, especially the useful notion of matrix nondecreasing functions.
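The sufficiency direction of the vector representer theorem can be checked numerically in its simplest instance. The sketch below is an illustration only, not code from the paper: it assumes a square loss with a squared L2 norm regularizer (ridge regression), random synthetic data, and a hypothetical regularization value `lam`. It verifies that the minimizer is a linear combination of the input vectors by comparing the primal solution with the dual form w = X^T c.

```python
import numpy as np

# Synthetic data (assumption for illustration): m measurements, d parameters.
# d > m, so the span of the inputs is a proper subspace of R^d.
rng = np.random.default_rng(0)
m, d = 5, 20
X = rng.standard_normal((m, d))  # rows are the input vectors x_i
y = rng.standard_normal(m)       # observed outputs
lam = 0.1                        # hypothetical regularization parameter

# Primal solution: minimize ||X w - y||^2 + lam * ||w||^2 over w in R^d.
w_primal = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Representer form: w = sum_i c_i x_i, with c = (X X^T + lam I)^{-1} y.
c = np.linalg.solve(X @ X.T + lam * np.eye(m), y)
w_dual = X.T @ c

# Orthogonal projector onto the span of the inputs (row space of X).
P = X.T @ np.linalg.solve(X @ X.T, X)

# Part of the primal solution lying outside the span of the inputs.
outside_span = w_primal - P @ w_primal

print("primal and representer forms agree:", np.allclose(w_primal, w_dual))
print("norm of component outside the span:", np.linalg.norm(outside_span))
```

Only the m coefficients in `c` need to be computed, regardless of the dimension d, which is the property that kernel-based methods exploit.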
|Open access status:||An open access publication|
|Keywords:||kernel methods, matrix learning, minimal norm interpolation, multi-task learning, regularization, multiple tasks, regression, networks|
|UCL classification:||UCL > School of BEAMS > Faculty of Engineering Science|