The importance of definitions in the measurement of long‐term health conditions in childhood. Variations in prevalence of long‐term health conditions in the UK using data from the Millennium Cohort Study, 2004–2015

Abstract Objectives To explore the impact of various measurements of long‐term health conditions (LTCs) on the resulting prevalence estimates using data from a nationally representative dataset. Methods Children and young people in the Millennium Cohort Study were followed at ages 3, 5, 7, 11, and 14 years (N = 15,631). We estimated the weighted prevalence of LTCs at each time point and examined the degree to which estimates agreed with alternate health indicators (special educational needs and disability [SEND], specific chronic conditions, and common chronicity criteria) using descriptive analyses, Cohen's kappa statistic, and percentage agreement. Results The estimated weighted prevalence of LTCs peaked at 5 years old (20%). Despite high percentage agreement, we observed at best moderate chance‐corrected agreement between the type of LTC and reasons for SEND (kappas from 0.02 to 0.56, percentage agreement from 97% to 99%) or specified chronic conditions (kappas from 0.002 to 0.02, percentage agreement from 73% to 97%). Applying chronicity criteria decreased the estimated weighted prevalence of LTCs (3%). Conclusion How long‐term conditions are defined drastically alters the estimated weighted prevalence of LTCs. Improved clarity and consistency in the definition and measurement of LTCs is urgently needed to underpin policy and commissioning of services.


| INTRODUCTION
Estimating the prevalence of long-term health conditions (LTCs) in children and young people (CYP) is essential for rational health care provision and commissioning, as well as the elaboration of health policies. However, definitions of LTCs are notoriously imprecise (van der Lee et al., 2007), which limits the opportunity to capture valid and reliable prevalence estimates. A quote frequently attributed to Socrates, which states 'The beginning of wisdom is the definition of terms', emphasises the importance of accurately describing who we plan to include as having a LTC in our studies.
The type and severity of the conditions included affect prevalence estimates and, owing to the considerable heterogeneity in study definitions, the reported prevalence estimates for longstanding health conditions in children vary widely across studies (0.2%-44%) (van der Lee et al., 2007). Most of the definitions identified in the literature are based on a combination of chronicity criteria, including the duration of symptoms and their consequences in terms of a) functional limitations and b) health care requirements (Addor et al., 1997; Feudtner et al., 2000; McPherson et al., 1998; Perrin et al., 1993; Pless & Douglas, 1971; Pless et al., 2010; Stein, 2011; Stein et al., 1993; Westbom & Kornfält, 1987). These criteria have been acknowledged by Mokkink et al. when reporting their consensus-based definition of chronic health conditions in childhood (Mokkink et al., 2008). Broader definitions consist of a measure of duration only (Knottnerus et al., 1992; Newacheck & Stoddard, 1994), although the required duration also varies between some of the most frequently cited definitions (van der Lee et al., 2007). For example, some studies report a minimum duration of 3 months in order for a condition to be classified as long-term (Perrin et al., 1993; Pless & Douglas, 1971), while others indicate a minimum duration of 12 months.
Some studies focussing on the measurement of chronic health conditions in childhood use lists (e.g., the International Classification of Diseases) of specific chronic conditions selected for their expected persistence or recurrent nature. Examples include asthma, eczema, hayfever, Attention Deficit Hyperactivity Disorder (ADHD), and Autism Spectrum Disorder (ASD) (Barlow & Ellard, 2006; McAleer et al., 2012; Neff et al., 2002; Newacheck & Stoddard, 1994; Wolraich et al., 2019).
Other studies of LTCs include functional impairment as well as, or instead of, a duration criterion (Cadman et al., 1986; Farooqi et al., 2006; Perrin et al., 1993; Stein et al., 2000), which may ultimately have a negative impact at school including increased absenteeism, grade repetition, and lower levels of educational attainment (McKinley Yoder & Cantrell, 2019). Indeed, the presence of a chronic illness may be associated with special education provision (McClanahan & Weismuller, 2015; McKinley Yoder & Cantrell, 2019). The UK describes four major reasons for special educational needs and disability (SEND) provision: problems with cognition and learning, communication and interaction (including ASD), sensory or physical needs, and socio-emotional or mental health needs (Department of Education, 2020). Given the scarcity of resources, SEND identification or provision for a health-related reason reflects not only the severity but also the chronicity of a condition and therefore information on the presence of SEND may provide useful data on the prevalence of certain LTCs in childhood in the absence of other direct health data. While consensus and consistency in the field are clearly key, we aimed to illustrate the impact of various measurements of LTCs on the resulting prevalence estimates using data from a large UK-representative birth cohort study.

| Participants
The current study analysed data from the Millennium Cohort Study (MCS), a birth cohort of individuals born across England, Scotland, Wales, and Northern Ireland at the start of this millennium. MCS is a multi-purpose study designed to explore the circumstances, growth, and development of CYP in the UK. The sample was constructed to be representative of the total UK population. In brief, the sample frame was defined as all living children born between September 2000 and 2002, resident in the UK at nine months old, and eligible to receive Child Benefit. At this time, the latter was a universal benefit and so covered all children except those whose residency status was uncertain or temporary, such as children of asylum seekers and members of foreign armed forces. The sample was drawn from electoral wards across the UK, clustered geographically and disproportionately stratified to overrepresent areas with high proportions of (typically difficult to reach) ethnic minorities in England, residents of wards with increased rates of child poverty across the UK, and residents of Scotland, Wales, and Northern Ireland to ensure that these populations are adequately represented. The baseline sample includes families from across the different ethnic groups and the socio-economic distribution (Connelly & Platt, 2014). Response rate was 96% (at Sweep 1), 81% (at Sweep 2), 79% (at Sweep 3), 72% (at Sweep 4), 69% (at Sweep 5), and 61% (at Sweep 6) (Joshi & Fitzsimons, 2016). In all UK countries, non-response rates (whether through non-contact or refusal) in each sweep have been consistently higher for participants in ethnic or disadvantaged areas as compared to families in advantaged areas (Ketende, 2010). It has been shown that attrition bias was more likely than initial response bias (Plewis, 2007) and one suggested solution for researchers is to supplement the survey design weights with attrition weights (Mostafa, 2013).
Full details of the study design and data collection are reported elsewhere (Dex & Joshi, 2005; Hansen, 2010; Plewis et al., 2007).
The MCS was approved by the National Health Service Research Ethics Committees (Shepherd & Gilbert, 2019). For this work, analysis of the openly available MCS data was undertaken and therefore specific ethics approval was not required. Data were accessed via the UK Data Service (https://ukdataservice.ac.uk/, reference numbers 5350, 5795, 6411, 7464, 8156).
In the present study, we estimated the weighted prevalence of any LTCs in CYP over time and tested agreement between different reports. We used data collected from age 3 (when data on any LTCs was first collected) to age 14.
All participants with complete information on the study variables at the time of assessment were included in analyses; 15,631 children had complete data on any LTCs at age 3 but 25% were lost to follow-up or had missing follow-up data by age 14.

| Measures
Data were provided by the main parent/carer, who was the biological mother in 96% of the cases. Information was collected during a face-to-face interview conducted by a trained interviewer at participants' homes. Different variables were available across the different study sweeps, as illustrated in Table 1 and described below.

| Any LTCs
Information on the presence of any LTCs (yes/no) was obtained from age 3 to age 14.

| Type of LTCs
More detailed data on the type of LTCs were only available at ages 11 and 14. If the parent answered 'yes' to whether the child has any LTCs, they were then asked 'Does this (any of these) condition(s) or illness(es) affect [Cohort member's name] in any of the following areas?'.
This was a nominal question with an open-answer element. Response categories were merged with free text-box answers to derive condition categories. We used the following response categories: sight problems (yes/no), hearing problems (yes/no), ADHD (yes/no), ASD (yes/no), mental illness (yes/no), dyslexia (yes/no), speech/language/communication problems (yes/no). Merging was carried out by the survey agency using the 10th revision of the International Classification of Diseases. Fifty-one categories were recorded at age 11 and 28 categories were recorded at age 14 (see Supporting Information S1: Appendix A and Appendix B).

| Statistical analysis
We estimated the population prevalence of LTCs by accounting for the MCS survey design and attrition/non-response sampling weights. Differences between CYP who were lost to follow-up or had missing follow-up data and CYP retained in the study up to age 14 were tested using chi-square tests. All analyses were carried out using the Statistical Package for Social Sciences (version 27).
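As a rough illustration of the two steps described above, the following Python sketch computes a weighted prevalence and a Pearson chi-square test on a 2x2 table. It is not the SPSS procedure used in the study: the function names, the toy responses, the weights, and the counts are all hypothetical, and the chi-square shown here is a simple unweighted test that ignores the clustered, stratified survey design.

```python
import math


def weighted_prevalence(has_ltc, weights):
    """Weighted proportion of respondents with an LTC (1 = yes, 0 = no)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w for y, w in zip(has_ltc, weights) if y) / total


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical toy data: five respondents with made-up combined weights.
has_ltc = [1, 0, 0, 1, 0]
weights = [0.5, 1.5, 1.0, 2.0, 1.0]
prevalence = weighted_prevalence(has_ltc, weights)

# Hypothetical counts: rows = lost to follow-up / retained, columns = LTC yes / no.
stat, p_value = chi_square_2x2(10, 20, 30, 40)
print(f"weighted prevalence = {prevalence:.3f}")
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
```

In this toy example the unweighted prevalence would be 2/5, whereas the weighted estimate differs because the two respondents with an LTC carry different weights; this is the kind of adjustment that design and attrition weights are intended to make.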

| RESULTS
Using the single question on any LTCs available across all sweeps, the estimated weighted prevalence of any LTCs in CYP aged 3 years old was 16%; it peaked at age 5 (20%) before decreasing at subsequent ages (19% at age 7, 14% at age 11, 18% at age 14; see Table 1 and Figure 1). The estimated weighted prevalence of SEND increased from age 7 (9%) to age 11 (12%) and slightly decreased at age 14 (11%) (see Table 2 and Figure 1); as Table 2 illustrates, not all CYP with SEND also had a LTC and vice versa.

| Agreement between type of LTCs and reasons for SEND
The proportion of reports that agreed ranged from 97% to 99%, but as Table 3 shows, there were low levels of chance-corrected agreement between reporting of the type of LTCs and the reasons for SEND at age 11. These ranged from no better than chance for dyslexia (κ = 0.02, p-value < 0.001) to moderate for ADHD or ASD (κ = 0.56, p-value < 0.001).
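To see how percentage agreement can be very high while chance-corrected agreement remains low, the following Python sketch computes both statistics from a 2x2 cross-classification of two yes/no reports. The function name and the counts are hypothetical and chosen only to mimic a rare condition; they are not MCS data.

```python
def agreement_2x2(both_yes, first_only, second_only, both_no):
    """Return (percentage agreement, Cohen's kappa) for two binary reports."""
    n = both_yes + first_only + second_only + both_no
    if n == 0:
        raise ValueError("the table must contain at least one observation")
    observed = (both_yes + both_no) / n
    p_first_yes = (both_yes + first_only) / n
    p_second_yes = (both_yes + second_only) / n
    # Agreement expected by chance if the two reports were independent.
    expected = p_first_yes * p_second_yes + (1 - p_first_yes) * (1 - p_second_yes)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for 1,000 children: 30 flagged by the first report,
# 2 flagged by the second report, and only 1 flagged by both.
observed, kappa = agreement_2x2(both_yes=1, first_only=29, second_only=1, both_no=969)
print(f"percentage agreement = {observed:.1%}, kappa = {kappa:.2f}")
```

Because almost all children are 'no' on both reports, the two reports agree for 97% of children, yet nearly all of that agreement is expected by chance alone, so kappa stays close to zero.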

| DISCUSSION
Findings from this study revealed that the measurements of LTCs across the sweeps of data collection for the MCS varied, which resulted in significant variation in the estimated prevalence of LTCs.
Descriptive statistics suggested that the number of CYP with any LTCs, as measured using the only question available across all sweeps of the MCS, was higher in childhood than in adolescence, in contrast with previous reports (Newacheck & Kim, 2005). While it is well-established that attrition is higher among participants with poorer health (Wolke et al., 2009), our analyses suggested no evidence of differential dropout. Furthermore, the high (>99%) data completeness for this key variable supports the validity of our results. These findings almost certainly reflect the variation in the definition of LTCs as read out to study participants by the interviewer across the different study sweeps (see Table 1). For example, a shorter minimum duration was applied to classify a condition as being longstanding at age 3 (at least three months) compared to ages 11 and 14 (at least 12 months). This longer duration, which was applied for the first time
We would further argue that health service contact is a poor indicator of need, as it is dependent on many other variables as well as the health needs of the young person (Ford, 2008). In an ideal world, self-report information on LTCs should be supplemented with direct measurements of aspects of the person (e.g., height and weight for objective obesity calculations) and healthcare records.

TABLE 3 Agreement between reporting of reasons for special educational/additional support needs and reporting of type of longstanding condition at age 11 in the Millennium Cohort Study
Column headings: SEND (n % yes); LTC (n % yes); kappa value, p value; Percent agreement (%)
Note: Results are weighted.
Abbreviations: LTC, long-term condition; n, number; SEND, special educational needs and disability.
*See Appendix A and Appendix C in Supporting Information S1 for the full list of SEND and LTC categories at age 11.

We also observed an increase in the number of CYP with SEND from age 7 to age 11 followed by a decrease at age 14. This finding is in line with previous official school data which indicate that SEND support reaches a peak at age 10, before decreasing as age increases through secondary ages (Department of Education, 2019). Similarly, the number of participants with both LTC and SEND increased with age, which indicates the anticipated overlap between these two conditions; for many children with SEND, their LTC explains their educational needs. SEND was consistently measured across study sweeps in the MCS (from age 7 onwards), hence may represent a more valid and reliable measure of LTCs in this cohort. However, one should consider that not all CYP with SEND have a chronic health condition since SEND status is not always illness-related but instead a consequence of cognitive or learning problems (see Supporting Information S1: Appendix C). At the same time, not all CYP with a LTC have SEND. Indeed, SEND requires that the health condition has an impact in school (Black et al., 2019) and therefore this measurement may serve as an indicator of more severe or more pervasive health conditions. Furthermore, some would argue that, as academic demands increase, more CYP would be expected to require additional help at school, and that, as argued above, provision does not necessarily equate well to need (Hutchinson, 2021).
We found at best moderate chance-corrected agreement between reports on the type of LTCs and reasons for SEND at age 11, although kappa statistics may be depressed at the extremes of agreement and could thus underestimate the level of agreement between different reports as percentage agreement often exceeded 97% (Spitznagel & Helzer, 1985). More than 3% of young people at age 11 had SEND status because of dyslexia, in agreement with previous official school data which reported that a substantial number of pupils in the UK receive SEND because of learning difficulties (Department of Education, 2020). In contrast, only 0.1% of participants mentioned dyslexia when asked about the presence of LTCs.
While this may relate to the perception that dyslexia is not a health condition, some participants still reported dyslexia when asked about the presence of any physical or mental LTCs. How questions are asked of participants is hugely important, yet survey questions are often not sufficiently piloted and tested, although doing so is a crucial part of survey design.
We observed low agreement between reports on the type of LTCs and specified chronic conditions at age 14; but importantly, for most of the examined conditions, more participants responded 'yes' to the binary disease-specific question than those who reported having the specific condition in the question about the type of LTC
Sembajwe et al., 2010; Simpson & Sheikh, 2010; To et al., 2012) while the type of LTC question may have underestimated the rates. A previous Canadian study comparing parental report on different childhood LTCs (asthma, bronchitis, and otitis) and health events (birth weight, accidents, immunisations, hospitalisations, health visits) found otitis and health visits to be the only measures that were underreported by the parent as compared to medical data, with all other measures being overreported (Pless & Pless, 1995). Another study conducted in the UK showed that the annual prevalence of self-reported asthma diagnosis and treatment was 9.6% versus 5.7% using clinicians' reports on the same condition and treatment (Mukherjee et al., 2016). Asthma, specifically, is an interesting
interventions to reduce disparities in access to services. An agreed terminology will also improve communication between professionals and ensure consistent identification of the patient group which will ultimately improve consistency, continuity, and effectiveness of care across health, education, and social services.
Our study is not without limitations. First, we have tested only one dataset to illustrate these issues, although we are aware of others with similar methodological considerations (Finning et al. In Submission). Future research should replicate our approach in other data sources. Secondly, the MCS has further data at age 17, which we excluded as LTCs were reported by young people themselves, rather than parents. There is a small literature that indicates poor agreement between informants (Collishaw et al., 2009) and therefore we chose to maintain consistency of informants. However, multi-informant diagnostic assessments may have provided greater accuracy than a single informant provided that there is a clear process for managing disagreement (Garb, 2005). Within-sweep comparisons of reporting of health conditions were examined for ages 11 and 14 only, as data were not available to us from earlier sweeps. The measurement of SEND at ages 7, 11, and 14 and the measurement of ADHD and ASD using the two binary questions at age 14 may reflect lifelong prevalence ('ever had') and therefore may not be directly comparable with alternate reports included in the study. Data linkage was out of the scope of this study and ethics approval for data linkage was not sought, therefore no efforts were made to link data from this cohort with health record data. Future studies should further explore differences in prevalence estimates by comparing survey data with medical records.
In conclusion, this is the first study to explore variation in the measurement of LTCs in childhood within a large, nationally representative dataset. While it is unlikely that there will be a single definition that suits all stakeholders for all purposes, improved clarity about the key dimensions and greater transparency about what is or is not included, and why, for particular purposes is desperately needed. Importantly, future longitudinal studies need to maintain a consistent system for assessing LTCs in order to ensure that prevalence estimates are comparable over time.

ACKNOWLEDGMENTS
We thank the Beryl Alexander Charity for funding this study.

CONFLICTS OF INTEREST
The authors declare that there is no conflict of interest that could be perceived as prejudicing the impartiality of the research reported.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are openly available via the UK Data Service at https://ukdataservice.ac.uk/ (reference numbers: 5350, 5795, 6411, 7464, 8156).