Wallis, SA; (2013) z-squared: The origin and application of χ². Journal of Quantitative Linguistics, 20 (4) 350-378. 10.1080/09296174.2013.830554.
Abstract
A set of statistical tests termed contingency tests, of which χ² is the best-known example, is commonly employed in linguistics research. Contingency tests compare discrete distributions, that is, data divided into two or more alternative categories, such as alternative linguistic choices of a speaker or different experimental conditions. These tests are ubiquitous and are part of every linguistics researcher’s arsenal. However, the mathematical underpinnings of these tests are rarely discussed in the literature in an approachable way, with the result that many researchers may apply tests inappropriately, fail to see the possibility of testing particular questions, or draw unsound conclusions. Contingency tests are also closely related to the construction of confidence intervals, which are highly useful and revealing methods for plotting the certainty of experimental observations. This paper is organized in the following way. The foundations of the simplest type of χ² test, the 2 × 1 goodness of fit test, are introduced and related to the z test for a single observed proportion p and the Wilson score confidence interval about p. We then show how the 2 × 2 test for independence (homogeneity) is derived from two observations p₁ and p₂ and explain when each test should be used. We also briefly introduce the Newcombe-Wilson test, which ideally should be used in preference to the χ² test for observations drawn from two independent populations (such as two sub-corpora). We then turn to tests for larger tables, generally termed r × c tests, which have multiple degrees of freedom and therefore may encompass multiple trends, and discuss strategies for their analysis. Finally, we turn briefly to the question of differentiating test results. We introduce the concept of effect size (also termed “measures of association”) and finally explain how we may perform statistical separability tests to distinguish between two sets of results.
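The relationship named in the abstract, between the 2 × 1 goodness of fit χ² test, the z test for a single proportion and the Wilson score interval, can be illustrated with a minimal sketch. The function names and the example counts (60 and 40 out of 100 cases, tested against an expected proportion P = 0.5) are hypothetical and chosen only for illustration; they are not taken from the paper. The formulas are the standard textbook forms, without continuity correction.

```python
import math

# Two-tailed critical value of the Normal distribution at the 0.05 level.
Z_CRIT = 1.959964


def z_single_proportion(p, P, n):
    """z score of an observed proportion p against an expected proportion P."""
    return (p - P) / math.sqrt(P * (1 - P) / n)


def chi_square_2x1(observed, expected):
    """Pearson chi-square statistic for a 2 x 1 goodness of fit test."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def wilson_interval(p, n, z=Z_CRIT):
    """Wilson score interval (lower, upper) about an observed proportion p."""
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Hypothetical data: 60 cases of choice A and 40 of choice B.
observed = [60, 40]
n = sum(observed)
P = 0.5                                # expected proportion under the null
expected = [n * P, n * (1 - P)]
p = observed[0] / n                    # observed proportion of choice A

z = z_single_proportion(p, P, n)
chi2 = chi_square_2x1(observed, expected)
lower, upper = wilson_interval(p, n)

print(f"z = {z:.4f}, z squared = {z * z:.4f}, chi-square = {chi2:.4f}")
print(f"Wilson interval about p = {p}: ({lower:.4f}, {upper:.4f})")
```

In this sketch the squared z score and the χ² statistic coincide, and the Wilson interval about p excludes P exactly when the z test rejects the null hypothesis at the same error level.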
Type: | Article |
---|---|
Title: | z-squared: The origin and application of χ². |
Open access status: | An open access version is available from UCL Discovery |
DOI: | 10.1080/09296174.2013.830554 |
Publisher version: | http://www.tandfonline.com/doi/full/10.1080/092961... |
Additional information: | © 2013 The Author(s). Published by Taylor & Francis This is an Open Access article distributed under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/3.0/, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The moral rights of the named author(s) have been asserted. |
UCL classification: | UCL > Provost and Vice Provost Offices > UCL SLASH > Faculty of Arts and Humanities > Dept of English Lang and Literature |
URI: | https://discovery.ucl.ac.uk/id/eprint/1414352 |