UCL Discovery
UCL home » Library Services » Electronic resources » UCL Discovery

Statistical and computational trade-offs in estimation of sparse principal components

Wang, T; Berthet, Q; Samworth, RJ; (2016) Statistical and computational trade-offs in estimation of sparse principal components. Annals of Statistics , 44 (5) pp. 1896-1930. 10.1214/15-AOS1369. Green open access

[thumbnail of Wang_euclid.aos.1473685263.pdf]
Preview
Text
Wang_euclid.aos.1473685263.pdf - Published Version

Download (509kB) | Preview

Abstract

In recent years, sparse principal component analysis has emerged as an extremely popular dimension reduction technique for high-dimensional data. The theoretical challenge, in the simplest case, is to estimate the leading eigenvector of a population covariance matrix under the assumption that this eigenvector is sparse. An impressive range of estimators have been proposed; some of these are fast to compute, while others are known to achieve the minimax optimal rate over certain Gaussian or sub-Gaussian classes. In this paper, we show that, under a widely-believed assumption from computational complexity theory, there is a fundamental trade-off between statistical and computational performance in this problem. More precisely, working with new, larger classes satisfying a restricted covariance concentration condition, we show that there is an effective sample size regime in which no randomised polynomial time algorithm can achieve the minimax optimal rate. We also study the theoretical performance of a (polynomial time) variant of the well-known semidefinite relaxation estimator, revealing a subtle interplay between statistical and computational efficiency.

Type: Article
Title: Statistical and computational trade-offs in estimation of sparse principal components
Open access status: An open access version is available from UCL Discovery
DOI: 10.1214/15-AOS1369
Publisher version: http://dx.doi.org/10.1214/15-AOS1369
Language: English
Additional information: This version is the version of record. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Science & Technology, Physical Sciences, Statistics & Probability, Mathematics, Computational lower bounds, planted clique problem, polynomial time algorithm, sparse principal component analysis, HIGH-DIMENSIONAL DATA, ADAPTIVE ESTIMATION, POWER METHOD, PCA, EIGENVALUE, MINIMIZATION, RELAXATIONS, CONSISTENCY, OPERATORS, CLIQUES
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science
URI: https://discovery.ucl.ac.uk/id/eprint/10055407
Downloads since deposit
79Downloads
Download activity - last month
Download activity - last 12 months
Downloads by country - last 12 months

Archive Staff Only

View Item View Item