UCL Discovery

High-dimensional changepoint estimation with heterogeneous missingness

Follain, B; Wang, T; Samworth, RJ; (2022) High-dimensional changepoint estimation with heterogeneous missingness. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 84 (3) pp. 1023-1055. 10.1111/rssb.12540. Green open access

Abstract

We propose a new method for changepoint estimation in partially observed, high-dimensional time series that undergo a simultaneous change in mean in a sparse subset of coordinates. Our first methodological contribution is to introduce a ‘MissCUSUM’ transformation (a generalisation of the popular cumulative sum statistics) that captures the interaction between the signal strength and the level of missingness in each coordinate. In order to borrow strength across the coordinates, we propose to project these MissCUSUM statistics along a direction found as the solution to a penalised optimisation problem tailored to the specific sparsity structure. The changepoint can then be estimated as the location of the peak of the absolute value of the projected univariate series. In a model that allows different missingness probabilities in different component series, we identify that the key interaction between the missingness and the signal is a weighted sum of squares of the signal change in each coordinate, with weights given by the observation probabilities. More specifically, we prove that the angle between the estimated and oracle projection directions, as well as the changepoint location error, are controlled with high probability by the sum of two terms, both involving this weighted sum of squares, and representing the error incurred due to noise and the error due to missingness, respectively. A lower bound confirms that our changepoint estimator, which we call MissInspect, is optimal up to a logarithmic factor. The striking effectiveness of the MissInspect methodology is further demonstrated both on simulated data and on an oceanographic data set covering the Neogene period.
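
The three steps described in the abstract (transform each coordinate, project across coordinates, locate the peak) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the CUSUM contrast is computed from observed entries only, and it swaps in soft thresholding followed by power iteration as a simple stand-in for the penalised optimisation problem. The function names `miss_cusum`, `soft`, `leading_left_vector` and `estimate_changepoint`, and the threshold parameter `lam`, are hypothetical.

```python
import math


def miss_cusum(x, observed):
    """CUSUM-type contrast for one coordinate, using observed entries only.

    x: list of floats (values at unobserved times are ignored).
    observed: list of 0/1 flags of the same length as x.
    Returns n - 1 values; entry t - 1 contrasts times 1..t with t+1..n.
    This form is an assumption, not necessarily the paper's definition.
    """
    n = len(x)
    total_sum = sum(v for v, o in zip(x, observed) if o)
    total_cnt = sum(observed)
    out = []
    left_sum, left_cnt = 0.0, 0
    for t in range(n - 1):
        if observed[t]:
            left_sum += x[t]
            left_cnt += 1
        right_cnt = total_cnt - left_cnt
        if left_cnt == 0 or right_cnt == 0:
            out.append(0.0)  # no contrast without data on both sides
            continue
        left_mean = left_sum / left_cnt
        right_mean = (total_sum - left_sum) / right_cnt
        scale = math.sqrt(left_cnt * right_cnt / (left_cnt + right_cnt))
        out.append(scale * (right_mean - left_mean))
    return out


def soft(v, lam):
    """Soft-threshold v at level lam (shrink towards zero)."""
    return math.copysign(max(abs(v) - lam, 0.0), v)


def leading_left_vector(mat, iters=100):
    """Leading left singular vector of a p x m matrix by power iteration."""
    p, m = len(mat), len(mat[0])
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(mat[j][t] * v[j] for j in range(p)) for t in range(m)]
        u = [sum(mat[j][t] * w[t] for t in range(m)) for j in range(p)]
        norm = math.sqrt(sum(a * a for a in u))
        if norm == 0.0:
            return v  # everything was thresholded away; keep current vector
        v = [a / norm for a in u]
    return v


def estimate_changepoint(data, mask, lam):
    """Return the 1-indexed time of the last observation before the change."""
    stats = [miss_cusum(row, obs) for row, obs in zip(data, mask)]
    shrunk = [[soft(v, lam) for v in row] for row in stats]
    direction = leading_left_vector(shrunk)  # stand-in projection direction
    p, m = len(stats), len(stats[0])
    projected = [abs(sum(direction[j] * stats[j][t] for j in range(p)))
                 for t in range(m)]
    return max(range(m), key=projected.__getitem__) + 1
```

Here `data` and `mask` are lists of rows, one per coordinate, with `mask[j][t]` equal to 1 when `data[j][t]` is observed. Thresholding before extracting the leading vector is one simple way to encourage a sparse direction; the choice of `lam` is left to the user in this sketch.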

Type: Article
Title: High-dimensional changepoint estimation with heterogeneous missingness
Open access status: An open access version is available from UCL Discovery
DOI: 10.1111/rssb.12540
Publisher version: https://doi.org/10.1111/rssb.12540
Language: English
Additional information: © 2022 The Authors. Journal of the Royal Statistical Society: Series B (Statistical Methodology) published by John Wiley & Sons Ltd on behalf of Royal Statistical Society. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
Keywords: Changepoint estimation, high-dimensional data, missing data, segmentation, sparsity
UCL classification: UCL
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Maths and Physical Sciences > Dept of Statistical Science
URI: https://discovery.ucl.ac.uk/id/eprint/10132755