The prognostic significance of early blood neurofilament light chain concentration and magnetic resonance imaging variables in relapse‐onset multiple sclerosis

Abstract

Background
Improved prognostication remains vital in multiple sclerosis to inform personalized treatment approaches. Blood neurofilament light (bNfL) is a promising prognostic biomarker, but to what extent it provides additional information, independent of established MRI metrics, is yet to be established.

Methods
We obtained all available bNfL data for 133 patients from a longitudinal observational cohort study. Patients were dichotomized into good or poor outcome groups based upon clinical and cognitive assessments performed 15 years after a clinically isolated syndrome. We performed longitudinal modeling of early NfL and MRI variables to examine differences between outcome groups.

Results
The bNfL dataset was incomplete, with one to three (mean 1.5) samples available per participant. Within 3 months of onset, bNfL was similar between groups. The bNfL concentration subsequently decreased in those with a good outcome, and remained persistently elevated in those with a poor outcome. By year 5, NfL in the poor outcome group was approximately double that of those with a good outcome (14.58 [10.40–18.77] vs. 7.71 [6.39–9.04] pg/ml, respectively). Differences were reduced after adjustment for longitudinal changes in T2LV, but trends persisted for a greater rate of increase in NfL in those with a poor outcome, independent of T2LV.

Conclusions
This analysis requires replication in cohorts with more complete bNfL datasets, but suggests that persistently elevated blood NfL may be more common in patients with a poor long‐term outcome. Persistent elevation of blood NfL may provide additional prognostic information not wholly accounted for by standard monitoring techniques.


INTRODUCTION
The prognosis of multiple sclerosis (MS) is highly variable, with important implications for the management of patient expectations and clinical decision making around disease-modifying therapies (DMTs).
While randomized controlled trial data on whether all patients with relapsing remitting MS (RRMS) should be offered high-efficacy DMT first line are awaited, most clinicians use demographic, clinical, and MRI variables to personalize treatment plans (Ontaneda et al., 2019). Data from historic, largely untreated cohort studies remain vital to this process (Brownlee et al., 2019, Tintore et al., 2015). Early studies of patients with relapse-onset MS identified clinical features associated with poor long-term prognosis (Scalfari et al., 2014, Eriksson & Andersen, 2003, Confavreux et al., 2003). The inclusion of longitudinal MRI variables subsequently expanded our ability to predict disability outcomes, with evidence of ongoing lesion accumulation, particularly when located in clinically eloquent sites, being of central importance (Brownlee et al., 2019, Tintore et al., 2015, O'Riordan et al., 1998, Brex et al., 2002, Tintoré et al., 2006, Fisniku et al., 2008, Swanton et al., 2009, Di Filippo et al., 2010, Tintore & Castillo, 2010). The (Brownlee et al., 2019, Fisniku et al., 2008). Until recently, fluid biomarkers of prognosis were limited to cerebrospinal fluid (CSF) oligoclonal bands (Tintore et al., 2015, Dobson et al., 2013). Our ability to quantify neurofilament light chain (NfL), as a fluid biomarker of neuroaxonal injury in either CSF or blood, has led to numerous studies assessing the prognostic significance of CSF NfL (cNfL) or blood NfL (bNfL).
Higher baseline cNfL appears to be associated with current active or chronic neuroinflammation, future inflammatory disease activity, and worse long-term disability outcomes (Ferreira-Atuesta et al., 2021, Maggi et al., 2021, Bhan et al., 2018, Modvig et al., 2014, Salzer et al., 2010, Ferraro et al., 2016). In one mixed cohort of patients with relapsing remitting (RR) or progressive MS (PMS), a 1000 pg/ml increase in baseline cNfL was associated with a subsequent EDSS increase of 0.47 [0.25-0.69] points over the next 5 years (Bhan et al., 2018). This has been replicated in some cohorts for bNfL, though the comparative ease with which bNfL can be repeatedly sampled has facilitated demonstrations that persistently elevated bNfL may have more prognostic significance than baseline measures alone (Calabresi et al., 2018, Cantó et al., 2019, Friedova et al., 2020). In one large mixed cohort, patients with subsequent EDSS worsening over 10 years were more likely to experience increases in bNfL compared with those with a stable EDSS (worsening EDSS: 1.017 pg/ml/year increase; stable EDSS: 1.002 pg/ml/year increase, p < .001), while baseline bNfL levels were similar (21.8 vs. 21.3 pg/ml, p = .69) (Cantó et al., 2019).
A weakness of the existing literature is that, when assessing the prognostic significance of bNfL, few studies take into account established MRI prognostic variables already available in clinical practice, such as T2 lesion load, location, and activity. Those that do include MRI covariates often only include baseline variables, when longitudinal changes are known to be more informative. It is yet to be established whether longitudinal measures of bNfL add independently significant prognostic information for patients following their first demyelinating event, or whether they merely reinforce what is already established with MRI data. As bNfL approaches use in clinical practice, this will soon become an important question for those making treatment decisions in early RRMS.
Here, we obtained all available bNfL data from an existing prospective, longitudinal, observational cohort of patients with relapse-onset MS and 15 years of follow-up. We modeled bNfL together with lesional and volumetric MRI variables over the first 5 years from clinical onset, based upon a long-term clinical outcome assessed 15 years after disease onset, to investigate the relationship of bNfL and MRI with disease course in the longer term.

Participants
Participants were prospectively recruited as previously reported (Brownlee et al., 2019).

MRI acquisition and analysis
From baseline to 5 years, all participants underwent the same MRI protocol on the same 1.5T Signa scanner, as previously described (Brownlee et al., 2019). Briefly, axial proton-density (PD)/T2-weighted and post-contrast T1-weighted fast-spin echo scans of the brain were acquired. Spinal cord MRI included sagittal T2-weighted and post-contrast T1-weighted scans of the whole spine and a volume-acquired inversion-prepared fast spoiled gradient echo scan of the cervical cord. Fast-spin echo scans were used for the volumetric brain measures.
Normalized brain volume was calculated from the baseline MRI scan using SIENAX, and follow-up scans were registered to the baseline scan to calculate the percentage brain volume change (PBVC) over time using SIENA (Smith et al., 2002). Upper cervical cord area was calculated as previously described using an active surface

NfL data
Blood samples were not systematically collected as part of the original study protocol, but 133 of the 166 participants with longitudinal follow-up had at least one plasma or serum sample available for analysis. A mean of 1.5 samples (range 1-3) was available for each participant in the study, with sample availability well balanced between those defined as having a good or poor outcome (Table 1). Recent work has highlighted the benefits of analyzing NfL as age- and body mass index (BMI)-adjusted Z-scores (Benkert et al., 2022). Since the required control and BMI data were not available for this historical cohort, we analyzed NfL without adjustment, but included age as a covariate in all multivariable analyses.

Categorical definition of long-term clinical outcome
To more fully capture the multifaceted aspects of long-term disability prioritized by patients with MS, we defined patients as having a poor long-term outcome if, at their final assessment, any one or more of the following features were present: diagnosis of SPMS; EDSS ≥ 3.0; cognitive impairment defined by PASAT3 or SDMT Z-score < -1.5; motor impairment defined by 25FW or 9HPT Z-score < -1.5. Z-scores were calculated based upon age-matched normative data.
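The composite definition above can be read as a simple "any criterion met" rule. The following is a minimal sketch of that rule, not the authors' code: the class and field names are hypothetical, Z-scores are assumed to be oriented so that lower values mean worse performance, and a missing test is assumed to count as "not impaired" (the text does not state how missing assessments were handled).

```python
from dataclasses import dataclass
from typing import Optional

Z_CUTOFF = -1.5    # impairment threshold on age-matched Z-scores (from the text)
EDSS_CUTOFF = 3.0  # EDSS >= 3.0 counts toward a poor outcome (from the text)


@dataclass
class FinalAssessment:
    """Hypothetical container for one participant's 15-year assessment."""
    spms: bool                        # diagnosed with secondary progressive MS
    edss: float
    pasat3_z: Optional[float] = None  # cognitive tests
    sdmt_z: Optional[float] = None
    t25fw_z: Optional[float] = None   # motor tests (25FW, 9HPT)
    nhpt_z: Optional[float] = None


def _impaired(*z_scores: Optional[float]) -> bool:
    # Assumption: a missing score (None) is treated as not impaired.
    return any(z is not None and z < Z_CUTOFF for z in z_scores)


def has_poor_outcome(a: FinalAssessment) -> bool:
    """True if any one or more of the listed criteria is present."""
    return bool(
        a.spms
        or a.edss >= EDSS_CUTOFF
        or _impaired(a.pasat3_z, a.sdmt_z)   # cognitive impairment
        or _impaired(a.t25fw_z, a.nhpt_z)    # motor impairment
    )
```

Note that the Z-score comparison is strict: a Z-score of exactly -1.5 does not meet the impairment criterion, whereas an EDSS of exactly 3.0 does.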

Longitudinal modeling of early NfL or imaging predictors between outcome groups
A statistical analysis plan was formulated prior to undertaking longitudinal modeling. Linear mixed models were used in order to include all patients with at least one bNfL sample in the analysis, retaining the timepoint since clinical onset at which the bNfL sample was taken. For the prespecified primary analysis, the dependent variable was bNfL from baseline to 5 years from clinical onset. The independent fixed effect variable was the categorical definition of good or poor long-term outcome. An interaction between outcome and time (categorically defined) was included. A random effect was included at the level of the participant. Marginal means and their 95% confidence interval were then calculated from this model to produce estimates for the bNfL concentration at each timepoint in the good and poor outcome groups from a single model. A separate model was also constructed with time as a continuous variable, to produce estimates of the difference in the overall rate of change in bNfL from baseline across all timepoints between the two outcome groups. Age at clinical onset, and its interaction with time, was additionally included as a fixed effect covariate in all analyses due to its established impact upon bNfL (Benkert et al., 2022).
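One way to write down the two models described above is sketched here. The notation is ours rather than the authors': $i$ indexes participants and $j$ their samples, $P_i$ is 1 for a poor outcome and 0 otherwise, $A_i$ is age at clinical onset, $T_{ijk}$ is an indicator that sample $j$ was taken at the $k$-th non-baseline timepoint, and $t_{ij}$ is time since onset in years. The random intercept $u_i$ and the normality assumptions are the conventional choices for a linear mixed model and are assumed, not stated in the text.

```latex
% Primary model: time as a categorical variable
\mathrm{bNfL}_{ij} = \beta_0 + \beta_1 P_i
  + \sum_{k} \gamma_k T_{ijk}
  + \sum_{k} \delta_k P_i T_{ijk}
  + \beta_2 A_i
  + \sum_{k} \theta_k A_i T_{ijk}
  + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2), \quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)

% Separate model: time as a continuous variable
\mathrm{bNfL}_{ij} = \beta_0 + \beta_1 P_i + \beta_2 t_{ij}
  + \beta_3 P_i t_{ij}
  + \beta_4 A_i + \beta_5 A_i t_{ij}
  + u_i + \varepsilon_{ij}
```

Under this reading, the marginal means at each timepoint come from the first model, and $\beta_3$ in the second model is the difference in the overall rate of change in bNfL (pg/ml/year) between the poor and good outcome groups.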
The prespecified secondary analysis was to repeat the above model-

Ethical approval and consent

Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.

RESULTS
Descriptive statistics for the 133 participants with available bNfL data are shown in

Cross-sectional comparisons between bNfL, MRI, and clinical variables
As shown in Figure 1, the expected cross-sectional relationships were apparent between bNfL and MRI/clinical variables when viewed across all timepoints.

Longitudinal changes in bNfL between good and poor outcome groups
Modeling of bNfL between outcome groups is summarized in Table 2 and Figure 2. At baseline (within 3 months of clinical onset), the differences in bNfL between outcome groups were small and not statistically significant. Patients with a good clinical outcome, however, tended to demonstrate a reduction in bNfL after the first demyelinating event, which remained persistently low throughout the next 5 years. In contrast, there was a trend for those in the poor outcome group to experience a greater rate of increase in bNfL with time (0.72 [−0.58 to 3.29] pg/ml/year greater, compared with the good outcome group), such that by 5 years after clinical onset, patients with a poor outcome had a significantly higher bNfL, approaching double that seen in those with a good clinical outcome (good outcome: 7.71 [6.39-9.04] pg/ml; poor outcome: 14.58 [10.40-18.77] pg/ml).
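The figures quoted above can be checked with a few lines of arithmetic. This is only an illustrative sketch using the values reported in the text; the helper names are ours, and comparing confidence intervals by overlap is a rough heuristic, not the test used in the analysis.

```python
# Values copied from the text: (estimate, lower 95% CI, upper 95% CI), in pg/ml.
good_y5 = (7.71, 6.39, 9.04)
poor_y5 = (14.58, 10.40, 18.77)
# Difference in rate of change (poor minus good), in pg/ml/year.
rate_diff = (0.72, -0.58, 3.29)


def intervals_overlap(a, b):
    """True if the confidence intervals of a and b share any value."""
    return a[1] <= b[2] and b[1] <= a[2]


def interval_contains(ci, value=0.0):
    """True if `value` lies inside the confidence interval of `ci`."""
    return ci[1] <= value <= ci[2]


ratio = poor_y5[0] / good_y5[0]
print(f"5-year ratio, poor/good: {ratio:.2f}")                   # 1.89
print("5-year CIs overlap:", intervals_overlap(good_y5, poor_y5))  # False
print("rate CI includes 0:", interval_contains(rate_diff))         # True
```

The ratio of about 1.89 matches "approaching double", the non-overlapping 5-year intervals are in line with the reported significant difference at that timepoint, and the rate-of-change interval spanning zero is why that result is described only as a trend.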
The analysis was repeated for the subgroup of patients who were diagnosed with MS during follow-up (97 patients; 47% having a poor outcome). The results were similar to those from the whole cohort (Table S1 and Figure S1). At baseline, NfL was similar between the good and poor outcome groups (11.49 [7.43-15.56

Longitudinal changes in MRI variables between good and poor outcome groups
Modeling of MRI variables between outcome groups is summarized in Table 2

Longitudinal changes in bNfL while adjusting for T2 lesion volume
Current guidance recommends the use of brain MRI only in the radiological monitoring of patients with MS (Wattjes et al., 2015). We therefore focused on longitudinal changes in bNfL while adjusting for brain T2LV, as summarized in Table 2 and Figure 4. After adjusting for longitudinal changes in T2LV, the previously described differences in NfL between the good and poor outcome groups were attenuated.
In particular, the marginal means and 95% confidence intervals at baseline and 1 year became very similar between the two groups, suggesting the variability in bNfL between the two outcome groups may largely be accounted for by changes in T2LV. At the 3- and 5-year timepoints, the confidence intervals were wide and overlapping between the two outcome groups. The marginal means, however, particularly at 5 years, were higher in the poor outcome group, and the overall rate of change in bNfL between groups, across all timepoints, again suggested a trend to a greater rate of increase in the poor outcome group (0.63 [−0.30 to 3.39] pg/ml/year greater increase in the poor outcome group). Adjusting for additional MRI covariates did not substantially alter these results.

FIGURE 3
Early longitudinal MRI modeling by 15-year outcome groups. Modeled longitudinal lesional and volumetric MRI data comparing the marginal means and their 95% confidence intervals between the good and poor outcome groups, estimated from a single model based upon distributions generated from 10,000 bootstrap replications. For both PBVC and percentage upper cervical cord area change, the baseline timepoint acts as the reference, and is set to zero. Age at clinical onset, and its interactions with time, is included as a covariate. T1-GAD+, T1 post-contrast enhancing lesions; PBVC, percentage whole brain volume change; UCCA-PVC, upper cervical cord area percentage change; T2LV, T2 lesion volume; T1LV, T1 lesion volume.
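The bootstrap step mentioned in the caption can be illustrated with a simplified sketch. It is not the authors' procedure: it resamples whole participants with replacement (so that repeated samples from one person stay together), uses a raw group mean in place of a model-based marginal mean, and reports a plain percentile interval rather than a bias-corrected and accelerated one. The function name and the toy data are hypothetical.

```python
import random
from statistics import mean


def cluster_bootstrap_ci(samples_by_participant, n_boot=10_000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) for the mean of all samples.

    Participants, not individual samples, are resampled with replacement,
    and the interval is read from the percentiles of the bootstrap means.
    """
    rng = random.Random(seed)
    ids = list(samples_by_participant)
    estimates = []
    for _ in range(n_boot):
        drawn = rng.choices(ids, k=len(ids))  # resample participants
        values = [v for pid in drawn for v in samples_by_participant[pid]]
        estimates.append(mean(values))
    estimates.sort()
    lower = estimates[round((alpha / 2) * (n_boot - 1))]
    upper = estimates[round((1 - alpha / 2) * (n_boot - 1))]
    point = mean(v for vs in samples_by_participant.values() for v in vs)
    return point, lower, upper


# Toy, invented bNfL values (pg/ml) with one to three samples per participant.
toy_group = {"p1": [8.0, 7.5], "p2": [12.0], "p3": [6.5, 7.0, 7.2], "p4": [15.0]}
point, lower, upper = cluster_bootstrap_ci(toy_group, n_boot=2000)
print(f"mean {point:.2f} pg/ml, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Resampling at the participant level mirrors the participant-level random effect in the mixed models; with very few participants, as in this toy example, the resulting interval is wide and should not be over-interpreted.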

DISCUSSION
Our results suggest that bNfL concentrations, within 3 months of disease onset, are similar between those with a good and poor long-term outcome. Those with a good long-term outcome, however, tend to subsequently experience a gradual and persistent reduction in bNfL, while bNfL tends to increase and remain persistently elevated in those with a poor outcome. At 5 years from disease onset, those with a poor outcome have a significantly higher bNfL, estimated to be almost double that of those with a good outcome.

TABLE 2
Marginal means and 95% confidence intervals for bNfL and MRI variables at each timepoint in the good and poor outcome groups, estimated from a single model based upon distributions generated from 10,000 bootstrap replications. The overall difference in the rate of change of each dependent variable from baseline to 5 years (plus bias-corrected and accelerated 95% confidence intervals) from a separate model with time as a continuous variable is also reported in the final column. For PBVC and UCCA-PC, as these measures are reported as the % change from baseline, no data is reported for the baseline timepoint. bNfL, blood Neurofilament light; NA, Not Applicable; T2LV, T2 lesion volume; T1LV, T1 lesion volume; GAD, Gadolinium; PBVC, percentage whole brain volume change; UCCA-PC, Upper cervical cord area percentage change.

When adjusting for change in T2LV, the differences in bNfL between prognostic groups were reduced. At baseline and 1 year after clinical onset, adjusted bNfL is similar in those with good and poor long-term outcomes. The trend toward a greater rate of increase in bNfL in the poor outcome group, however, persists despite adjustment for T2LV, suggesting that independent of changes in T2LV, a persistently elevated bNfL may be of prognostic importance.
The key limitation of this work is the incomplete availability of bNfL samples. This introduces the potential for bias and reduces our power to detect differences in bNfL between the outcome groups, although importantly, the number of NfL samples was well balanced between outcome groups. Our analyses should therefore be viewed as preliminary, should only be interpreted at the group level, and require confirmation in similar prospective cohorts with standardized collection of bNfL. Their results may therefore actually be consistent with our own, in that persistent elevations of NfL may be of more prognostic significance than very early NfL levels following CIS onset. Further analyses in CIS cohorts with more complete NfL datasets, however, are required to assess the prognostic significance of early baseline NfL compared to the subsequent rate of change in NfL.
Our finding of a trend toward a greater rate of increase in bNfL, independent of T2LV, in the poor prognostic group is supported by other recent studies. In a large mixed cohort of people with MS, and in a separate RRMS group, NfL appeared to provide additional prognostic information regarding future inflammatory disease activity, in addition to that determined through the monitoring of clinical and MRI variables (new T2 lesions or enhancing lesions) (Benkert et al., 2022, Uher et al., 2021).
While one should be cautious about applying such group-level data to individual patients, these results suggest that incorporating NfL monitoring into clinical practice may be useful in improving our ability to predict future disease activity.
The majority of patients included in this study never received DMT (78% untreated). While this was in accordance with UK practice at the time, most would now be offered treatment. The prognostic utility of longitudinal bNfL monitoring in clinical practice should also be examined in patients receiving immunomodulatory treatment in order to assess whether failure to normalize bNfL following initiation of treatment is associated with a poor long-term prognosis. Early evidence from randomized controlled trial datasets suggests that persistently elevated NfL following initiation of treatment may indeed be associated with a worse outcome, although in patients with low levels of inflammatory activity on stable high-efficacy DMT, progression appears to occur largely independent of baseline or longitudinal NfL concentrations (Bridel et al., 2021). Future studies should additionally assess whether persistently elevated NfL after treatment initiation should be included as a criterion for treatment escalation.

CONCLUSION
bNfL is similar between long-term prognostic groups within 3 months of a first demyelinating event, with a subsequent persistent elevation of bNfL seen in those with a poor long-term clinical outcome. Such differences are reduced after adjusting for changes in T2LV, but trends toward a greater rate of increase in bNfL persist, independent of T2LV, in those with a poor outcome. The incomplete bNfL dataset means this analysis is likely to be underpowered to detect smaller differences in bNfL between outcome groups, and the results should therefore be viewed as preliminary. Further studies with more complete datasets should look to confirm whether persistently elevated bNfL, the monitoring of which may soon be available in clinical practice, provides additional prognostic information beyond that obtained through the current clinical practice of routine monitoring of brain T2 lesion load.