UCL Discovery

Canonical Correlation Analysis and Partial Least Squares for identifying brain-behaviour associations: a tutorial and a comparative study

Mihalik, Agoston; Chapman, James; Adams, Rick A; Winter, Nils R; Ferreira, Fabio S; Shawe-Taylor, John; Mourão-Miranda, Janaina; (2022) Canonical Correlation Analysis and Partial Least Squares for identifying brain-behaviour associations: a tutorial and a comparative study. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging, 7 (11) pp. 1055-1067. 10.1016/j.bpsc.2022.07.012. Green open access

Abstract

Canonical Correlation Analysis (CCA) and Partial Least Squares (PLS) are powerful multivariate methods for capturing associations across two modalities of data (e.g., brain and behaviour). However, when the sample size is similar to or smaller than the number of variables in the data, CCA and PLS models may overfit, i.e., find spurious associations that generalise poorly to new data. Dimensionality reduction and regularized extensions of CCA and PLS have been proposed to address this problem, yet most studies using these approaches have some limitations. This work gives a theoretical and practical introduction to the most common CCA/PLS models and their regularized variants. We examine the limitations of standard CCA and PLS when the sample size is similar to or smaller than the number of variables. We discuss how dimensionality reduction and regularization techniques address this problem and explain their main advantages and disadvantages. We highlight crucial aspects of the CCA/PLS analysis framework, including optimising the hyperparameters of the model and testing the identified associations for statistical significance. We apply the described CCA/PLS models to simulated data and real data from the Human Connectome Project and the Alzheimer's Disease Neuroimaging Initiative (both with n > 500). We use both low- and high-dimensionality versions of each dataset (i.e., ratios between sample size and number of variables in the range of ∼1-10 and ∼0.1-0.01, respectively) to demonstrate the impact of data dimensionality on the models. Finally, we summarize the key lessons of the tutorial.
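
The overfitting described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: the function names (`inv_sqrt`, `fit_rcca`, `score_corr`), the ridge penalty `reg`, and the data sizes are assumptions chosen for illustration. It fits the first canonical pair of a ridge-regularized CCA on pure noise, where no true association exists, and compares the in-sample correlation of the scores with the correlation on independent data.

```python
import numpy as np


def inv_sqrt(C):
    # Inverse matrix square root of a symmetric positive-definite matrix.
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T


def fit_rcca(X, Y, reg):
    # First pair of weight vectors for ridge-regularized CCA (assumes reg > 0).
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Leading singular vectors of the whitened cross-covariance matrix.
    U, _, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    return Wx @ U[:, 0], Wy @ Vt[0]


def score_corr(X, Y, wx, wy):
    # Pearson correlation between the projected scores of the two views.
    return np.corrcoef(X @ wx, Y @ wy)[0, 1]


rng = np.random.default_rng(0)
n_train, n_test, p, q = 50, 1000, 40, 40  # sample size close to variable count

# Independent Gaussian noise: there is no real X-Y association to find.
X_train, Y_train = rng.normal(size=(n_train, p)), rng.normal(size=(n_train, q))
X_test, Y_test = rng.normal(size=(n_test, p)), rng.normal(size=(n_test, q))

# A tiny penalty keeps the covariances invertible while staying close to standard CCA.
wx, wy = fit_rcca(X_train, Y_train, reg=1e-6)
r_train = score_corr(X_train, Y_train, wx, wy)
r_test = score_corr(X_test, Y_test, wx, wy)
print(f"in-sample r = {r_train:.3f}, out-of-sample r = {r_test:.3f}")
```

With these sizes the in-sample correlation should be close to 1 even though the data are noise, while the out-of-sample correlation should stay near 0. Increasing `reg` in this sketch shrinks the weights and lowers the in-sample correlation; how to choose such a hyperparameter is one of the topics the tutorial covers.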

Type: Article
Title: Canonical Correlation Analysis and Partial Least Squares for identifying brain-behaviour associations: a tutorial and a comparative study
Location: United States
Open access status: An open access version is available from UCL Discovery
DOI: 10.1016/j.bpsc.2022.07.012
Publisher version: https://doi.org/10.1016/j.bpsc.2022.07.012
Language: English
Additional information: © 2022 Published by Elsevier Ltd. This is an open access article under the CC BY 4.0 license (Attribution 4.0 International, https://creativecommons.org/licenses/by/4.0/).
Keywords: CCA, PLS, brain-behaviour association, high-dimensional data, overfitting, regularization
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Division of Psychiatry > Mental Health Neuroscience
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Brain Sciences > Division of Psychiatry
URI: https://discovery.ucl.ac.uk/id/eprint/10153939