Trajectories of physical and mental functioning over 25 years before onset of frailty: results from the Whitehall II cohort study

Abstract
Background
Research on frailty, a major contributor to heterogeneity in health, is undertaken on older adults although the processes leading to frailty are likely to begin earlier in the life course. Using repeated data spanning 25 years, we examined changes in physical and mental functioning before the onset of frailty, defined using Fried's frailty phenotype (FFP).
Methods
Functioning was measured using the Short-Form 36 General Health Survey (SF-36) on nine occasions from 1991 (age range 40–63 years) to 2015 (age range 63–85 years). The poorest of four FFP scores from 2002, 2007, 2012 and 2015 was used to classify participants as frail, pre-frail, or robust. We used linear mixed models with a backward timescale such that time 0 was the person-specific date of frailty classification for frail and pre-frail participants and the end of follow-up for robust participants. Analyses adjusted for socio-demographic factors, health behaviours, body mass index and multi-morbidity status were used to compare SF-36 physical (PCS) and mental (MCS) component summary scores over 25 years before time 0 as a function of FFP classification, with estimates extracted at time 0, −5, −10, −15, −20 and −25 years. We also used illness-death models to examine the prospective association between SF-36 component summary scores at age 50 and incident FFP-defined frailty.
Results
Among 7044 participants of the Whitehall II cohort study included in the analysis [29% female, mean age 49.7 (SD = 6.0) at baseline in 1991], 2055 (29%) participants remained robust, 4476 (64%) became pre-frail and 513 (7%) became frail during follow-up. Compared with robust participants, frail participants had lower SF-36 scores 25 years before onset of frailty (t = −25), with a difference of 3.4 [95% confidence interval (CI) 1.6, 5.1] in PCS and 1.8 (−0.2, 3.8) in MCS. At t = 0, the differences increased to 11.5 (10.5, 12.5) and 9.1 (8.0, 10.2), respectively. The differences in SF-36 between the robust and pre-frail groups, although smaller [at t = 0, 1.7 (1.2, 2.2) in PCS and 4.0 (3.4, 4.5) in MCS], were already observed 20 and 25 years, respectively, before the onset of pre-frailty. Prospective analyses showed that at age 50, scores in the bottom quartiles of PCS [hazard ratio (HR) compared with the top quartile = 2.39, 95% CI 1.85, 3.07] and MCS [HR = 1.49 (1.15, 1.93)] were associated with a higher risk of FFP-defined frailty at older ages.
Conclusions
Differences in trajectories of physical and mental functioning in individuals who developed physical frailty at older ages were observable 25 years before onset of FFP-defined frailty. These findings highlight the need for a life course approach in efforts to prevent frailty.


Introduction
The concept of frailty, defined as a state of increased vulnerability to stressors, emerged in the gerontology literature to explain clinical heterogeneity in the health of older adults. 1 Current clinical practice guidelines recommend screening for frailty in all adults 65 years and older in the general population, 2 with the World Health Organization advocating active case finding and reorientation of health services for individuals with frailty. 3 Tools to measure frailty have been developed and used mostly on older adults, often older than 75 years. These include Fried's frailty phenotype (FFP), 4 which has also been used in intervention studies on frailty. 5-7 Although the prevalence of frailty increases steadily with age, the processes underlying frailty are likely to begin well before old age. There is emerging evidence of frailty in middle-aged adults, 8-10 and the importance of a life course approach to frailty is increasingly recognized. 11 Despite considerable research on frailty in recent years, little is known about the changes in physical and mental function leading to this syndrome. 11,12 Better understanding of these changes is likely to provide insight into the optimal timing of screening, targeted therapeutic interventions and early prevention.
The objective of the present study was to examine whether deficits in physical and mental functioning are present before the onset of frailty, defined using FFP. Using data from the Whitehall II cohort study, we compared 25-year trajectories of physical and mental functioning before the onset of FFP frailty. In complementary prospective analyses, we examined whether poor functioning at age 50 was associated with risk of FFP frailty at older ages.

Study population and design
The Whitehall II study is an ongoing prospective cohort study of 10 308 British civil servants (6895 men and 3413 women) aged 35-55 in 1985-1988. 13 Since baseline, follow-up clinical examinations have taken place approximately every 4-5 years, using home-based assessments for those who choose this option and clinic-based assessments (London and major cities in the UK) for others; each wave has taken approximately 2 years to complete. In addition to clinical examinations, data over the follow-up have been obtained via questionnaire surveys and linkage to electronic health records of the UK National Health Service (NHS). The NHS provides most of the healthcare in the country, including inpatient and outpatient care, and record linkage is undertaken using a unique NHS identifier held by all UK residents. At each wave, participants provided informed written consent and research ethics approval was obtained from the NHS London-Harrow Research Ethics Committee (latest reference number 85/0938).
4 Because weight was measured every 5 years, we used a cut-off of a 10% loss of body weight, as used in the Women's Health and Aging Study I. 15
4. Low physical activity was denoted by an energy expenditure of <383 kcal/week for men and <270 kcal/week for women, assessed based on responses to a questionnaire on frequency and duration of participation in 20 physical activities (e.g. cycling, housework and gardening activities). A metabolic equivalent value was assigned to each activity to calculate the energy expenditure of each participant.
5. Exhaustion was defined based on responses to two items extracted from the Center for Epidemiologic Studies Depression (CES-D) scale: 'I felt that everything I did was an effort in the last week' and 'I could not get going in the last week'. If participants answered 'occasionally or moderate amount of the time (3-4 days)' or 'most or all of the time (5-7 days)' to either of these items, they were categorized as exhausted.
At each of the four waves between 2002 and 2015, the FFP score was calculated as the number of components meeting the criteria described above, resulting in a score ranging from 0 to 5. The poorest performance recorded during this period was used to attribute FFP status to each participant: frail if their score was 3 or more, pre-frail for a score of 1 or 2 and robust for those with no impaired criteria. Participants classified as pre-frail and frail were censored at the corresponding date of their worst FFP status and robust participants at last participation (corresponding to their last clinical examination between 2002 and 2015).
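The classification rule above can be sketched in a few lines of Python. This is only an illustration, not the study's code: the names `Assessment`, `ffp_category` and `classify_participant` are hypothetical, and the choice of the first examination at which the worst category was recorded as the anchor date is an assumption, since the text states only that the date corresponds to the worst FFP status.

```python
from __future__ import annotations

from datetime import date
from typing import NamedTuple


class Assessment(NamedTuple):
    """One FFP assessment: examination date and number of impaired criteria (0-5)."""
    exam_date: date
    score: int


def ffp_category(score: int) -> str:
    """Map an FFP score to a category: 0 robust, 1-2 pre-frail, 3 or more frail."""
    if not 0 <= score <= 5:
        raise ValueError("FFP score must be between 0 and 5")
    if score >= 3:
        return "frail"
    if score >= 1:
        return "pre-frail"
    return "robust"


def classify_participant(assessments: list[Assessment]) -> tuple[str, date]:
    """Return the worst FFP category over all waves and the anchor date (time 0)."""
    if not assessments:
        raise ValueError("at least one FFP assessment is required")
    ordered = sorted(assessments, key=lambda a: a.exam_date)
    rank = {"robust": 0, "pre-frail": 1, "frail": 2}
    categories = [ffp_category(a.score) for a in ordered]
    worst = max(categories, key=rank.__getitem__)
    if worst == "robust":
        # Robust participants are anchored at their last examination.
        return worst, ordered[-1].exam_date
    # ASSUMPTION: anchor at the first examination showing the worst category.
    first_worst = next(a for a, c in zip(ordered, categories) if c == worst)
    return worst, first_worst.exam_date
```

For example, a participant scoring 0, 1, 3 and 2 across the four waves would be classified as frail, with time 0 set to the date of the third examination.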

Short-Form 36 General Health Survey
The Short-Form 36 General Health Survey (SF-36) was administered at nine data collection waves (1991, 1995, 1997, 2001, 2002, 2006, 2007, 2012 and 2015). 16,17 The SF-36 was designed as a measure of general health status and health-related quality of life. It contains 36 questions grouped into eight subscales covering the following domains: physical functioning, bodily pain, general health, physical role functioning, vitality, emotional functioning, social role functioning and general mental health. Responses to the questions within each subscale were combined to generate eight scores from 0 to 100, with higher scores indicating better health. The SF-36 was also summarized into physical and mental component summary scores (PCS and MCS) to measure physical and mental functioning. All subscales contribute in varying proportions to PCS and MCS; these scores range from 0 to 100 and are constructed such that mean scores in the population are 50.

Covariates
Socio-demographic variables included age, sex, ethnicity (White or non-White), current marital status (living with a partner or single) and occupational position at age 50 (high, intermediate and low, reflecting income and status at work). 13 Health behaviours included smoking status (never smoker, ex-smoker, current smoker), alcohol consumption (no alcohol in the previous week; moderate, 1-14 units/week; high, >14 units/week), physical activity (less than or at least the recommended 150 min per week of moderate-to-vigorous physical activity) and frequency of fruit and vegetable consumption (less than daily, at least once daily).
Chronic conditions were ascertained from clinical examinations in the study and linkage to electronic health records. Three national databases were used: the national Hospital Episode Statistics (HES) database with inpatient and outpatient data; the Mental Health Services Data Set, which in addition to inpatient and outpatient data also has data on care in the community; and the cancer registry. Chronic conditions considered were diabetes (fasting glucose ≥7.0 mmol/L, reported doctor-diagnosed diabetes, use of diabetes medication, ICD10: E10-E14), coronary heart disease (12-lead resting ECG recording, ICD10: I20-I25), stroke (MONICA-Augsburg stroke questionnaire, ICD10: I60-I64), cancer (cancer registry with malignant cancer, ICD10: C00-C97), dementia (ICD10: F00-F03, F05.1, G30, G31), Parkinson's disease (self-report of long-standing illness, ICD10: G20), chronic obstructive pulmonary disease (self-report of long-standing illness, ICD10: J41-J44), depression (self-report of long-standing illness, use of antidepressants, ICD10: F32-F33) and arthritis (self-report of long-standing illness, ICD10: M05, M06, M15-M19). Multi-morbidity was defined as the presence of two or more chronic conditions, and multi-morbidity status was categorized as 0, 1, or 2 or more conditions.
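As a minimal sketch of the three-level multi-morbidity variable, assuming each participant's ascertained conditions are available as a set of names, the categorization could look as follows. The function name `multimorbidity_status` and the category labels are hypothetical.

```python
from __future__ import annotations

# The nine chronic conditions listed in the text.
CONDITIONS = frozenset({
    "diabetes",
    "coronary heart disease",
    "stroke",
    "cancer",
    "dementia",
    "Parkinson's disease",
    "chronic obstructive pulmonary disease",
    "depression",
    "arthritis",
})


def multimorbidity_status(present: set[str]) -> str:
    """Categorize the number of chronic conditions as '0', '1' or '2+'."""
    unknown = present - CONDITIONS
    if unknown:
        raise ValueError(f"unrecognized conditions: {sorted(unknown)}")
    count = len(present)
    if count == 0:
        return "0"
    if count == 1:
        return "1"
    return "2+"  # two or more conditions, i.e. multi-morbidity
```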

Mortality
Death from any cause was ascertained using mortality records obtained from the British national mortality register (NHS Central Registry) until October 2019. The tracing exercise was carried out using the National Health Service identification number (NHS-ID) of each participant.

Statistical analysis
The association between SF-36 and FFP status was examined using two approaches: (i) comparison of SF-36 trajectories over 25 years as a function of FFP status (robust, pre-frail, frail) of participants and (ii) time-to-event analysis to examine the association between poor SF-36 scores at age 50 and incident frailty, defined using FFP.

Trajectory analysis
We compared SF-36 trajectories (PCS, MCS and the eight subscales, between 1991 and 2015) as a function of FFP status, defined as the poorest score out of four frailty assessments between 2002 and 2015. Trajectories of SF-36 were estimated using linear mixed models with a backward timescale, anchored to the date of frailty classification such that time 0 was the date at which a participant was classified at their worst FFP-defined status (pre-frail or frail). Data on SF-36 after frailty/pre-frailty classification were discarded as our aim was to compare SF-36 trajectories before the onset of frailty. For participants who remained robust throughout the study, time 0 was the date of clinical examination at last participation. The analysis was adjusted for socio-demographic factors (sex, ethnicity, marital status, occupational position and age at time 0), frailty status, time terms (time, time² and time³) and interactions of time terms with socio-demographic factors and with FFP status (Model 1); health behaviours (physical activity, alcohol, tobacco and fruit/vegetable consumption) (Model 2); and BMI and multi-morbidity status (Model 3). Apart from sex, ethnicity, age at time 0 and FFP status at time 0, covariates were time-varying and were entered in the analyses concurrent with the measure of SF-36. Random effects for the intercept and time were included to allow for inter-individual differences in SF-36 at the intercept (time = 0, at the frailty classification) and in the rate of change in SF-36 over the backward follow-up. Estimates of differences in SF-36 between FFP status groups were extracted at time 0, −5, −10, −15, −20 and −25 years from FFP classification.
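The backward timescale can be illustrated with a short Python sketch that prepares one participant's SF-36 observations for such a model. It is not the authors' code: the function name `backward_time_rows`, the row layout and the conversion of days to years with 365.25 are all assumptions made for illustration.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumption: average year length used for the conversion


def backward_time_rows(sf36_obs, time0):
    """Build model rows for one participant on a backward timescale.

    sf36_obs: iterable of (observation_date, score) pairs.
    time0: anchor date (FFP classification, or last examination if robust).
    Observations after time0 are discarded; time is in years and is <= 0.
    """
    rows = []
    for obs_date, score in sorted(sf36_obs):
        t = (obs_date - time0).days / DAYS_PER_YEAR
        if t > 0:
            continue  # SF-36 measured after classification is not used
        rows.append({"time": t, "time2": t ** 2, "time3": t ** 3, "score": score})
    return rows
```

An observation made 20 years before the anchor date gets time = −20, and the squared and cubed terms supply the polynomial time terms described above.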

Time-to-event analysis
The prospective analyses were based on dichotomous measures of SF-36 (scores in the worst quartile vs others), retrieved from the wave closest to when participants were 50 (±5) years, to examine associations with FFP-defined frailty (frail participants compared with robust and pre-frail participants) over the follow-up of participants who were not frail at age 50. These analyses were carried out using an interval-censored illness-death model with a Weibull distribution to extract hazard ratios (HRs) of frailty in those with poor SF-36 scores compared with others. This method takes the interval-censored nature of the data and the competing risk of death into account. Interval censoring was used because frailty was measured only at the waves of data collection and not continuously, such that the exact date of onset could lie anywhere in the interval between two clinical examinations. Each SF-36 scale was analysed separately, and all analyses were first adjusted for socio-demographic variables (Model 1), then for health behaviours at age 50 (Model 2) and subsequently also for BMI and multi-morbidity status at age 50 (Model 3).
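To make the idea of interval censoring concrete, the sketch below derives the interval in which frailty onset must lie from a participant's sequence of examinations. It illustrates the data structure only; it is not the SmoothHazard interface and it does not model the competing risk of death. The function name `frailty_interval` and the use of age as the time variable are assumptions.

```python
from __future__ import annotations

from typing import Optional


def frailty_interval(exams: list[tuple[float, bool]]) -> tuple[float, Optional[float]]:
    """Return (left, right) bounds for the age at frailty onset.

    exams: (age_at_exam, is_frail) pairs, in any order.
    left is the age at the last examination without frailty; right is the
    age at the first examination with frailty, or None if frailty was never
    observed (right-censored at the last examination).
    """
    if not exams:
        raise ValueError("at least one examination is required")
    last_non_frail = None
    for age, is_frail in sorted(exams):
        if is_frail:
            if last_non_frail is None:
                raise ValueError("frail at first examination: no lower bound")
            return last_non_frail, age
        last_non_frail = age
    return last_non_frail, None
```

A participant who was not frail at ages 65.0 and 70.2 but frail at age 75.1 contributes the interval (70.2, 75.1): onset occurred somewhere between the two examinations.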
The analyses were conducted using R software (R Core Team, 2021, version 4.1.2). Linear mixed models, comparisons of SF-36 trajectories between frail/pre-frail/robust groups and illness-death models were performed using the nlme (version 3.1-153), emmeans (version 1.7.2) and SmoothHazard (version 1.4.1) packages, respectively. Estimates were reported with 95% confidence intervals (95% CI), and two-tailed P-values were considered significant at the 0.05 level.

Additional analysis
For the analysis of SF-36 trajectories, we repeated the main analyses using age as an alternative timescale. Linear mixed models were used to examine SF-36 trajectories between 40 and 85 years according to the worst FFP-defined frailty status. These analyses were adjusted for the same covariates as in the main analyses, and age terms (age, age², age³) were used to model the timescale of the SF-36 trajectories. For the time-to-event approach, we repeated the analysis using continuous SF-36 scores; the risk of frailty was estimated for a 5-point lower SF-36 score.
Between 1991 and 2015, over a mean period of 21.4 [standard deviation (SD) 4.0] years, participants provided a mean of 6.4 (SD 2.0) measures of SF-36; 94% of participants had an SF-36 measure at the 1991 wave. Over the four measures of FFP between 2002 and 2015, 7% (513/7044) of participants were classified as frail, 64% (4476/7044) as pre-frail and 29% (2055/7044) as robust based on their worst FFP score over this period. Among those classified as frail and pre-frail, most (74% for both) participants had not changed their FFP status at the last measurement of FFP (Table S1). Given the low proportion of participants who changed FFP status, and the focus of our analyses being on SF-36 trajectories over 25 years before FFP classification, we did not take these changes into account in the analyses. Table 1 shows that at the 1991 wave, participants who went on to be classified as frail over the follow-up were older (53.1 vs 49.6 and 49.2 years, P-value < 0.001), more likely to be women (46%, 217/468, vs 30%, 1277/4201, and 23%, 445/1928, P-value < 0.001) and to report at least one chronic disease (10%, 47/468, vs 5%, 218/4201, and 3%, 62/1928, P-value < 0.001), and had lower SF-36 scores (P-value < 0.001 for all) compared with pre-frail and robust participants, respectively. Mean age at time 0 (classification of FFP status) was 75.4 (SD 6.3) for frail, 71.8 (SD 6.2) for pre-frail and 70.7 (SD 5.9) for robust participants.
Table footnotes: FFP classification as robust, pre-frail and frail was based on the poorest FFP score using four waves of data between 2002 and 2015; robust corresponds to FFP score of 0, pre-frail to scores of 1 or 2 and frail to scores of 3 or higher. The multi-morbidity status is composed of diabetes, chronic heart disease, stroke, cancer, dementia, Parkinson's disease, chronic obstructive pulmonary disease, depression and arthritis.

Analysis of SF-36 trajectories
Figure 1 shows trajectories of SF-36 component summary scores up to 25 years before onset of pre-frailty, frailty or end of follow-up using a backward timescale. Compared with the robust group, frail participants had the poorest scores, particularly in PCS, with an acceleration of the difference at older ages. Estimates of the difference in these scores, every 5 years over the 25 years preceding time 0, are shown in Table 2. PCS was lower (3.4, 95% CI 1.6; 5.1) in frail compared with robust participants 25 years before FFP classification (t = −25), with the difference increasing to 11.5 (95% CI 10.5; 12.5) at onset of frailty (t = 0). Similar differences were observed between frail and pre-frail participants, at time −25 years (3.1, 95% CI 1.4; 4.8) and increasing at time 0 to 9.8 (95% CI 8.9; 10.7). There were differences in PCS between robust and pre-frail participants, starting at time −20 years, but they were smaller; for example, the difference at time 0 was 1.7 (95% CI 1.2; 2.2). Table 2 also shows differences in MCS between the robust, pre-frail and frail groups; the pattern of results was similar to PCS, but the differences between these groups were smaller; at time 0, the difference was 9.1 (95% CI 8.0; 10.2) between robust and frail participants and 5.1 (95% CI 4.1; 6.1) between pre-frail and frail participants. Figure 2 shows the scores in SF-36 subscales (physical functioning, bodily pain, general health, physical role functioning, vitality, emotional functioning, social role functioning and general mental health) from −25 years to time 0 in robust, pre-frail and frail participants. The estimated differences between robust and pre-frail (Table S2), between robust and frail (Table S3) and between pre-frail and frail (Table S4) suggest a similar pattern of findings with significant differences observed 25 years before onset of FFP-defined frailty.
Results using age as the timescale are shown in Table S5 and Figures S2 and S3. The largest differences in PCS were observed at age 85, between the robust and frail groups (21.4, 95% CI 18.5; 24.2) and between the pre-frail and frail groups (15.8, 95% CI 13.2; 18.5). These differences were smaller at younger ages, but PCS scores were already higher in robust and pre-frail than in frail participants at age 45. The pattern of results for MCS was similar, although, as in the main results, the differences were smaller. The same was observed for subscales of the SF-36 (Figure S3).

Time-to-event analysis
Among the 7044 participants with data on FFP frailty, 5161 (73%) had data on SF-36 at age 50 (±5) and were included in the time-to-event analyses. Over a mean follow-up of 21.4 (SD 4.0) years, 269 (5%) of these participants were classified as being frail (three or more impaired FFP criteria). At 50 years, participants who became frail had lower scores compared with non-frail participants on all SF-36 subscales and component scores (Table S6). Table 3 shows results of the association between poor (lowest quartile) SF-36 scores (PCS, MCS and the eight subscales) at age 50 and FFP frailty over the follow-up, in analyses that took into account the competing risk of death and the interval-censored nature of the data. In fully adjusted analyses, poor scores on PCS (HR = 2.39, 95% CI 1.85; 3.07) and MCS (HR = 1.49, 95% CI 1.15; 1.93) were associated with higher risk of FFP frailty. Results were similar when SF-36 scores at age 50 were considered as continuous measures (Table S7). A 5-point lower score in PCS (HR = 1.30, 95% CI 1.22; 1.37) and MCS (HR = 1.14, 95% CI 1.08; 1.22) was associated with an increased risk of FFP-defined frailty.
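A hazard ratio reported per 5 points can be re-expressed for another score difference if one assumes, as a continuous-score proportional hazards model does, that the score acts linearly on the log-hazard scale. The sketch below shows that arithmetic; the function name `rescale_hazard_ratio` is hypothetical, and any value beyond the reported 5-point contrast is an extrapolation under that assumption rather than a result of the study.

```python
import math


def rescale_hazard_ratio(hr: float, from_points: float, to_points: float) -> float:
    """Re-express a hazard ratio for a different score difference.

    ASSUMPTION: the log hazard is linear in the score, so the hazard ratio
    for k points is exp(k * beta), with beta the log hazard ratio per point.
    """
    if hr <= 0 or from_points == 0:
        raise ValueError("hr must be positive and from_points non-zero")
    beta_per_point = math.log(hr) / from_points
    return math.exp(beta_per_point * to_points)
```

Under this assumption, the reported PCS hazard ratio of 1.30 per 5-point lower score corresponds to roughly 1.05 per 1-point lower score and 1.69 per 10-point lower score.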

Discussion
Our study, which used nine repeated measures of SF-36 physical and mental functioning between midlife and old age, presents three key findings on individuals who developed FFP-defined frailty. First, analysis of SF-36 trajectories showed that participants who developed frailty at older ages had lower SF-36 scores compared with robust and pre-frail participants 25 years before the onset of frailty. This finding was confirmed in time-to-event analyses where poor SF-36 scores at age 50 were associated with a higher risk of incident frailty. Our results suggest that frailty at older ages involves changes over a long period, with heterogeneity in functioning already evident in midlife. The extent to which differences in functioning are evident even earlier remains unknown as we did not have data earlier in the life course. Second, differences in groups defined by frailty status were considerably larger for physical than mental component scores, highlighting the importance of deficits in physical function for FFP frailty. Third, there were only small differences in SF-36 trajectories between the robust and pre-frail groups, with considerably larger differences between the pre-frail and frail groups and between the robust and frail groups.
Heterogeneity in individual trajectories of measures of health and functioning is a hallmark of ageing. 18,19 The concept of frailty was developed to capture this heterogeneity at older ages in the general population. 1 Accumulation of three or more of the five components of FFP (slow gait speed, weakness, unintentional weight loss, low physical activity and exhaustion) has been hypothesized to capture a state of vulnerability to risk of adverse health outcomes. 1 Our study shows that trajectories of physical and mental functioning, as measured by the SF-36 PCS and MCS, respectively, of individuals who go on to be classified as frail diverged 25 years before the onset of FFP frailty and as early as age 45 using age as the timescale. Time-to-event analyses also showed poor SF-36 scores at age 50, 15 years before the recommended age for routine screening of frailty, 2,20 to be associated with higher risk of FFP-defined frailty. These findings suggest that there is an urgent need to better understand the changes, starting early in the life course, that lead to physical frailty at older ages.

Strengths and limitations
This study contributes to the emerging literature on determinants and changes leading to FFP frailty. The main strength of the study is modelling the course of changes in SF-36, a validated and standardized tool, 16,17 with repeated measurements over a 25-year period, starting at age 40. The use of Fried's frailty scale, 4 the most widely used frailty measure in the literature, 7 is a further strength. Another strength is the analytic approach, which combined analysis of trajectories with time-to-event analyses to examine the robustness of the findings.
The study findings need to be considered in light of some limitations. First, the measure of FFP frailty was developed in 2001 and introduced to the study in 2002; hence, the first of four assessments of FFP in our study took place when the mean age of participants was 61. Availability of the measure in 1991, concurrent with the SF-36 measure, would have allowed analyses of onset of frailty earlier in the life course. Second, classifying participants as robust, pre-frail and frail based on their worst FFP score over four measures does not reflect the dynamics of the frailty process over time. As the focus of our analyses was on changes in functioning, we did not examine changes in the patterns of FFP status using repeated measures. Third, analyses were adjusted for chronic diseases using multi-morbidity, but examination of changes in SF-36 trajectories in groups defined by frailty after the occurrence of an acute health event was beyond the scope of the present study. Fourth, whether SF-36 trajectories as a function of FFP status differ in specific socio-demographic subgroups could not be examined due to small numbers.

Comparison with previous studies
A recent meta-analysis of 22 studies reported poorer quality of life, measured using a range of instruments including the SF-36, in frail compared with pre-frail or robust older adults. 21 Despite methodological heterogeneity, studies that used SF-36 showed larger differences in physical rather than mental functioning. This was also the case in cross-sectional analysis of the association between frailty and SF-36 in the European Male Ageing Study on men between 40 and 79 years. 22 The concept of frailty is thought to reflect loss of biological reserve, a possible explanation for the stronger association of frailty with physical functioning aspects of the SF-36.
Prospective studies have examined risk factors for frailty; these include studies on socio-economic factors, 23,24 health behaviours, 25-27 obesity, 28,29 early life adversities 30,31 and poor self-rated health in midlife. 32 One previous study used group-based modelling on self-rated health with three assessments over 8 years to show persistent poor self-rated health to be associated with higher risk of frailty. 33 The present study adds to the life course approach to frailty using nine measurements of SF-36 over a 25-year period to show diverging SF-36 trajectories as a function of FFP status, both using a backward timescale and age as the timescale. To our knowledge, this is the first study to show robust differences in functioning 25 years before the onset of frailty.

Meaning of findings
The concept of frailty was designed to capture heterogeneity in health of older adults, 1 but whether FFP-defined frailty, as currently measured, is confined to old age is increasingly debated. Landmark studies 9,10 show frailty to be prevalent in middle-aged adults in the general population, and like those with frailty at older ages, these individuals have a higher risk of adverse health outcomes. A recent reflection on the use of frailty for clinical practice and public health recommended that the use of a life course approach would provide insight into the development of frailty. 11 Our findings and previous studies on midlife physical frailty suggest that development of a standard instrument to measure frailty, particularly before age 65, is crucial as current tools were developed for use in older adults. The components of FFP may well be suitable, but the thresholds on each component, defined on an older population in the original study, 4 may not be ideal before age 65. We show differences in functioning as early as age 45 between those who developed frailty later in life and those who remained robust. Whether adapted measures of physical frailty would have picked up frailty earlier in the life course remains unclear.
Differences in SF-36 scores were observed at age 45 in our study, but whether these differences are clinically meaningful warrants further research. 34 Studies investigating the minimal clinically important difference in SF-36 report heterogeneous results depending on the target population and methodology. 35,36 Overall, clinically significant changes in SF-36 at the individual level appear to be between 5 and 10 points but can range from 2 to 22 points depending on the subscales being considered. In our study, the differences in PCS between robust and frail groups increased from 3.4 at time −25 to 11.5 at time 0. The difference in MCS between these groups was smaller, but it increased fivefold over the 25 years. The difference between robust and frail participants in the physical role subscale increased from 6.6 at time −25 to 38.8 at time 0. The additional analyses using age as the timescale showed a robust increase in differences in SF-36 with age, but the point at which these differences become clinically important remains unclear.

Conclusions
Our analysis shows poorer physical and mental functioning 25 years before the onset of FFP-defined frailty and as early as age 45 years in those who go on to develop FFP-defined frailty, particularly in physical functioning. These findings highlight the need for frailty screening prior to old age, perhaps using instruments and thresholds of functional impairment that are better suited to middle-aged adults for effective prevention.