Linear regression and the normality assumption

Schmidt, AF; Finan, C; (2018) Linear regression and the normality assumption. Journal of Clinical Epidemiology, 98 pp. 146-151. 10.1016/j.jclinepi.2017.12.006. Green open access

Text (Article): Schmidt_UCL_depos_JCE2018.pdf - Accepted Version (671kB)
Text (Supplementary data): Schmidt_AFSCHMIDT Appendix Revision.pdf - Accepted Version (526kB)

Abstract

Objectives: Researchers often perform arbitrary outcome transformations to fulfill the normality assumption of a linear regression model. This commentary explains and illustrates that in large data settings such transformations are often unnecessary and, worse, may bias model estimates.

Study Design and Setting: Linear regression assumptions are illustrated using simulated data and an empirical example on the relation between time since type 2 diabetes diagnosis and glycated hemoglobin levels. Simulation results were evaluated on coverage, i.e., the number of times the 95% confidence interval included the true slope coefficient.

Results: Although outcome transformations bias point estimates, violations of the normality assumption in linear regression analyses do not. The normality assumption is necessary to estimate standard errors without bias, and hence confidence intervals and P-values. However, in large samples (e.g., where the number of observations per variable is >10), violations of this normality assumption often do not noticeably affect results. In contrast, assumptions about the parametric model, the absence of extreme observations, homoscedasticity, and independence of the errors remain influential even in large sample size settings.

Conclusion: Given that modern healthcare research typically includes thousands of subjects, focusing on the normality assumption is often unnecessary, does not guarantee valid results, and, worse, may bias estimates through the practice of outcome transformations.
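The coverage criterion described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation code. It is a minimal example under assumed settings: 200 observations, a true slope of 0.5, skewed mean-zero errors drawn from a shifted exponential distribution, and a normal-approximation confidence interval. The function names `slope_ci` and `coverage` are hypothetical.

```python
import math
import random
from statistics import NormalDist


def slope_ci(x, y, level=0.95):
    """Return the OLS slope and a normal-approximation confidence interval."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares and the usual standard error of the slope.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(sse / (n - 2) / sxx)
    # Assumption: normal quantile instead of the t quantile (close for large n).
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return slope, slope - z * se, slope + z * se


def coverage(n=200, reps=1000, true_slope=0.5, seed=1):
    """Fraction of simulated data sets whose interval contains the true slope."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.uniform(0, 10) for _ in range(n)]
        # Skewed, mean-zero errors: exponential(1) shifted by its mean.
        y = [1.0 + true_slope * xi + (rng.expovariate(1.0) - 1.0) for xi in x]
        _, low, high = slope_ci(x, y)
        hits += low <= true_slope <= high
    return hits / reps


if __name__ == "__main__":
    print(f"coverage with skewed errors: {coverage():.3f}")
```

If the abstract's claim holds for these assumed settings, the reported coverage should sit close to the nominal 0.95 even though the errors are clearly non-normal. Reducing `n` to a handful of observations is one way to explore where the approximation starts to degrade.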

Type:	Article
Title:	Linear regression and the normality assumption
Open access status:	An open access version is available from UCL Discovery
DOI:	10.1016/j.jclinepi.2017.12.006
Publisher version:	https://doi.org/10.1016/j.jclinepi.2017.12.006
Language:	English
Additional information:	This version is the author accepted manuscript. For information on re-use, please refer to the publisher’s terms and conditions.
Keywords:	Epidemiological methods, Bias, Linear regression, Modeling assumptions, Statistical inference, Big data
UCL classification:	UCL > Provost and Vice Provost Offices > School of Life and Medical Sciences > Faculty of Population Health Sciences > Institute of Cardiovascular Science > Population Science and Experimental Medicine
URI:	https://discovery.ucl.ac.uk/id/eprint/10070182
