UCL Discovery

Linear regression and the normality assumption

Schmidt, AF; Finan, C; (2018) Linear regression and the normality assumption. Journal of Clinical Epidemiology, 98, pp. 146-151. 10.1016/j.jclinepi.2017.12.006. Green open access

Text (Article): Schmidt_UCL_depos_JCE2018.pdf - Accepted Version (671kB)
Text (Supplementary data): Schmidt_AFSCHMIDT Appendix Revision.pdf - Accepted Version (526kB)

Abstract

Objectives: Researchers often perform arbitrary outcome transformations to fulfill the normality assumption of a linear regression model. This commentary explains and illustrates that in large data settings, such transformations are often unnecessary and, worse, may bias model estimates.

Study Design and Setting: Linear regression assumptions are illustrated using simulated data and an empirical example on the relation between time since type 2 diabetes diagnosis and glycated hemoglobin levels. Simulation results were evaluated on coverage, i.e., the number of times the 95% confidence interval included the true slope coefficient.

Results: Although outcome transformations bias point estimates, violations of the normality assumption in linear regression analyses do not. The normality assumption is necessary to unbiasedly estimate standard errors, and hence confidence intervals and P-values. However, in large sample sizes (e.g., where the number of observations per variable is >10), violations of this normality assumption often do not noticeably impact results. In contrast, assumptions on the parametric model, absence of extreme observations, homoscedasticity, and independence of the errors remain influential even in large sample size settings.

Conclusion: Given that modern healthcare research typically includes thousands of subjects, focusing on the normality assumption is often unnecessary, does not guarantee valid results, and, worse, may bias estimates due to the practice of outcome transformations.
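The coverage criterion described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation; it is a minimal example under stated assumptions: a single predictor, independent and homoscedastic errors drawn from a skewed (shifted exponential) distribution, and a normal-quantile approximation to the t critical value. The function name `simulate_coverage` and all parameter values are hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist


def simulate_coverage(n=200, n_sims=1000, true_slope=0.5, seed=1):
    """Fraction of simulated 95% CIs for the OLS slope that contain true_slope."""
    rng = random.Random(seed)
    # Assumption: normal quantile used as an approximation to the t quantile.
    z = NormalDist().inv_cdf(0.975)
    hits = 0
    for _ in range(n_sims):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Assumption: skewed, non-normal errors (exponential shifted to mean zero).
        e = [rng.expovariate(1.0) - 1.0 for _ in range(n)]
        y = [1.0 + true_slope * xi + ei for xi, ei in zip(x, e)]

        # Ordinary least squares for a single predictor.
        x_bar = sum(x) / n
        y_bar = sum(y) / n
        sxx = sum((xi - x_bar) ** 2 for xi in x)
        sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        slope = sxy / sxx
        intercept = y_bar - slope * x_bar

        # Standard error of the slope from the residual variance.
        rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
        se = math.sqrt(rss / (n - 2) / sxx)

        # Count the interval as a hit if it includes the true slope.
        if slope - z * se <= true_slope <= slope + z * se:
            hits += 1
    return hits / n_sims


coverage = simulate_coverage()
print(f"Empirical coverage: {coverage:.3f}")
```

Under these assumptions, the empirical coverage would be expected to land near the nominal 95% level despite the non-normal errors. Making the errors heteroscedastic or dependent in the same sketch is one way to explore the other assumptions the abstract describes as remaining influential.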

Type: Article
Title: Linear regression and the normality assumption
Open access status: An open access version is available from UCL Discovery
DOI: 10.1016/j.jclinepi.2017.12.006
Publisher version: https://doi.org/10.1016/j.jclinepi.2017.12.006
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords: Epidemiological methods, Bias, Linear regression, Modeling assumptions, Statistical inference, Big data
UCL classification: UCL
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Cardiovascular Science
UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Cardiovascular Science > Population Science and Experimental Medicine
URI: https://discovery.ucl.ac.uk/id/eprint/10070182