UCL Discovery

The effect of sample size on polygenic hazard models for prostate cancer

Karunamuni, RA; Huynh-Le, M-P; Fan, CC; Eeles, RA; Easton, DF; Kote-Jarai, Z; Amin Al Olama, A; ... PRACTICAL Consortium (2020) The effect of sample size on polygenic hazard models for prostate cancer. European Journal of Human Genetics, 28 pp. 1467-1475. 10.1038/s41431-020-0664-2. Green open access

Pashayan_Effect of sample size on polygenic hazard models for PCa.pdf - Accepted Version

Abstract

We determined the effect of sample size on performance of polygenic hazard score (PHS) models in prostate cancer. Age and genotypes were obtained for 40,861 men from the PRACTICAL consortium. The dataset included 201,590 SNPs per subject, and was split into training and testing sets. Established-SNP models considered 65 SNPs that had been previously associated with prostate cancer. Discovery-SNP models used stepwise selection to identify new SNPs. The performance of each PHS model was calculated for random sizes of the training set. The performance of a representative Established-SNP model was estimated for random sizes of the testing set. Mean HR98/50 (hazard ratio of top 2% to average in test set) of the Established-SNP model increased from 1.73 [95% CI: 1.69-1.77] to 2.41 [2.40-2.43] when the number of training samples was increased from 1 thousand to 30 thousand. Corresponding HR98/50 of the Discovery-SNP model increased from 1.05 [0.93-1.18] to 2.19 [2.16-2.23]. HR98/50 of a representative Established-SNP model using testing set sample sizes of 0.6 thousand and 6 thousand observations were 1.78 [1.70-1.85] and 1.73 [1.71-1.76], respectively. We estimate that a study population of 20 thousand men is required to develop Discovery-SNP PHS models while 10 thousand men should be sufficient for Established-SNP models.

Type: Article
Title: The effect of sample size on polygenic hazard models for prostate cancer
Location: England
Open access status: An open access version is available from UCL Discovery
DOI: 10.1038/s41431-020-0664-2
Publisher version: http://dx.doi.org/10.1038/s41431-020-0664-2
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Genetics research, Risk factors
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Epidemiology and Health
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Epidemiology and Health > Applied Health Research
URI: https://discovery.ucl.ac.uk/id/eprint/10101322
Downloads since deposit: 39