UCL Discovery

## Binomial Confidence Intervals and Contingency Tests: Mathematical Fundamentals and the Evaluation of Alternative Methods

Wallis, S; (2013) Binomial Confidence Intervals and Contingency Tests: Mathematical Fundamentals and the Evaluation of Alternative Methods. Journal of Quantitative Linguistics , 20 (3) 178 - 208. 10.1080/09296174.2013.799918.

## Abstract

Many statistical methods rely on an underlying mathematical model of probability based on a simple approximation, one that is simultaneously well-known and yet frequently misunderstood. The Normal approximation to the Binomial distribution underpins a range of statistical tests and methods, including the calculation of accurate confidence intervals, performing goodness of fit and contingency tests, line- and model-fitting, and computational methods based upon these. A common mistake is in assuming that, since the probable distribution of error about the “true value” in the population is approximately Normally distributed, the same can be said for the error about an observation. This paper is divided into two parts: fundamentals and evaluation. First, we examine the estimation of confidence intervals using three initial approaches: the “Wald” (Normal) interval, the Wilson score interval and the “exact” Clopper-Pearson Binomial interval. Whereas the first two can be calculated directly from formulae, the Binomial interval must be approximated towards by computational search, and is computationally expensive. However this interval provides the most precise significance test, and therefore will form the baseline for our later evaluations. We also consider two further refinements: employing log-likelihood in intervals (also requiring search) and the effect of adding a continuity correction. Second, we evaluate each approach in three test paradigms. These are the single proportion interval or 2 × 1 goodness of fit test, and two variations on the common 2 × 2 contingency test. We evaluate the performance of each approach by a “practitioner strategy”. Since standard advice is to fall back to “exact” Binomial tests in conditions when approximations are expected to fail, we report the proportion of instances where one test obtains a significant result when the equivalent exact test does not, and vice versa, across an exhaustive set of possible values. We demonstrate that optimal methods are based on continuity-corrected versions of the Wilson interval or Yates’ test, and that commonly-held beliefs about weaknesses of tests are misleading. Log-likelihood, often proposed as an improvement on , performs disappointingly. Finally we note that at this level of precision we may distinguish two types of 2 2 test according to whether the independent variable partitions data into independent populations, and we make practical recommendations for their use.

Type: Article Binomial Confidence Intervals and Contingency Tests: Mathematical Fundamentals and the Evaluation of Alternative Methods An open access version is available from UCL Discovery 10.1080/09296174.2013.799918 © 2013 The Author(s). Published by Taylor & FrancisThis is an Open Access article distributed under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/3.0/, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The moral rights of the named author(s) have been asserted. UCLUCL > Provost and Vice Provost Offices > UCL SLASHUCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of Arts and HumanitiesUCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of Arts and Humanities > Dept of English Lang and Literature https://discovery.ucl.ac.uk/id/eprint/1414353