Central reader evaluation of MRI scans of the sacroiliac joints from the ASAS classification cohort: discrepancies with local readers and impact on the performance of the ASAS criteria

Objectives: The Assessment of SpondyloArthritis international Society (ASAS) MRI working group conducted a multireader exercise on MRI scans from the ASAS classification cohort to assess the spectrum and evolution of lesions in the sacroiliac joint and the impact of discrepancies with local readers on the numbers of patients classified as axial spondyloarthritis (axSpA).
Methods: Seven readers assessed baseline scans from 278 cases and 8 readers assessed baseline and follow-up scans from 107 cases. Agreement for detection of MRI lesions between central and local readers was assessed descriptively and by the kappa statistic. We calculated the number of patients classified as axSpA by the ASAS criteria after replacing local detection of active lesions with central reader detection and replacing local reader radiographic sacroiliitis with central reader structural lesions on MRI.
Results: Structural lesions, especially erosions, were as frequent as active lesions (≈40%), the majority of patients having both types of lesions. The ASAS definitions for active MRI lesion typical of axSpA and erosion were comparatively discriminatory between axSpA and non-axSpA. Local reader overcall for active MRI lesions was about 30% but this had a minor impact on the number of patients (6.4%) classified as axSpA. Substitution of radiography with MRI structural lesions also had little impact on classification status (1.4%).
Conclusion: Despite substantial discrepancy between central and local readers in interpretation of both types of MRI lesion, this had a minor impact on the numbers of patients classified as axSpA, supporting the robustness of the ASAS criteria to differences in assessment of imaging.


INTRODUCTION
The ASAS classification cohort study (ASAS-CC) recruited patients referred to a rheumatologist with undiagnosed back pain. It led to the ASAS classification criteria in which patients diagnosed with axial spondyloarthritis (axSpA) could be classified as having axSpA by either an imaging or clinical arm 1 . Imaging criteria for sacroiliitis could be either radiographic or the presence of bone marrow edema (BME) as elaborated in the ASAS consensus definition 2,3 . The sensitivity and specificity of the criteria were 83% and 84%, respectively, and follow up after 4.4 years indicated a high positive predictive value for a rheumatologist's diagnosis of axSpA 4 .
The assessment of MRI scans from the ASAS-CC by local readers was limited to determining whether the baseline scan demonstrated active and/or structural lesions typical of axSpA 1 . In the decade since this study, our understanding of MRI lesions in the sacroiliac joint (SIJ) has increased substantially 5 , but longitudinal data have been obtained from cohorts of patients with symptoms restricted to 2-3 years and not from the typical patient referred to a rheumatologist, in whom symptom duration averages 8-9 years 6,7 . Moreover, it has been recognized that BME can be observed in the SIJ in other disorders and even in 20-40% of healthy individuals [8][9][10] . This has led to concerns focused on the accuracy of local reader interpretations of imaging findings on MRI in the ASAS-CC and whether [...] respectively. Agreement for detection of MRI lesions between central and local readers was assessed descriptively and using the kappa statistic. We calculated the number of patients who were classified differently after central reader detection of active lesions on MRI replaced local reader detection, and after central reader detection of structural lesions on MRI replaced local reader detection of radiographic sacroiliitis, both for overall fulfillment of the ASAS criteria and for the imaging arm of the criteria.

Spectrum of MRI lesions at baseline and follow up in the ASAS-CC
In exercise A, 199/275 (72.3%) cases were diagnosed as having axSpA at baseline and 131/170 (77.1%) were diagnosed with axSpA at follow up. For MRI Q1, active lesions typical of axSpA were observed by a majority of readers in 43.2% and 44.3% of cases diagnosed with axSpA at baseline and follow up, respectively, as compared to 3.9% and 5.1% of cases diagnosed without axSpA (Table 1). The most frequent lesion was subchondral inflammation, which was observed in 51.3% and 13.2% of cases diagnosed with and without axSpA, respectively.
Inflammation at the site of erosion, enthesitis, and joint space fluid were each observed in 5-10% of cases diagnosed as axSpA. The first two lesions were also 100% specific for axSpA. For MRI Q2, structural lesions typical of axSpA were observed in 39.4% and 44.6% of cases diagnosed with axSpA at baseline and follow up, respectively, as compared to 9.7% and 6.5% of cases without axSpA (Table 1). The most frequent lesion was erosion, followed by fat lesion. The frequencies of MRI lesions were similar when individual reader observations were analyzed (supplementary Table 1). Most patients with lesions typical of axSpA had a combination of acute and structural lesions, with only 4.6% of cases having only acute lesions and 4.6% having only structural lesions typical of axSpA (supplementary Table 2). Thirteen percent of cases had active or structural lesions typical of axSpA according to the majority of readers but were diagnosed as not having axSpA at baseline and follow up.
In exercise B, assessment of MRI scans blinded to baseline and follow up time points demonstrated that central reader detection of active lesions typical of axSpA was 100% and 95.2% specific for rheumatologist diagnosis of axSpA, respectively (Table 2).
Sensitivity for diagnosis of axSpA was 41% at baseline and 28% at follow up. There was a decrease of 9.3% in the proportion of cases from the entire cohort with active inflammatory lesions typical of axSpA (MRI Q1) from baseline to follow up (p=0.05). Subchondral inflammation was observed in 49% of cases diagnosed as axSpA at baseline and 36% at follow up but also in 4.2% and 14.3% of baseline and follow up scans from cases without axSpA. Nineteen cases (17.8%) were started on tumor necrosis factor inhibitor (TNFi) therapy during the course of follow up. Of these cases, 57.9% had a reduction in inflammatory lesions compared to 5.7% of cases not receiving TNFi therapy (p<0.001).
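The sensitivity and specificity figures above follow from a 2×2 cross-tabulation of central MRI reading against rheumatologist diagnosis. The following is a minimal sketch of that calculation; the function name and the counts are hypothetical illustrations and are not taken from the study tables.

```python
def sensitivity_specificity(tp: int, fn: int, fp: int, tn: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from the four cells of a 2x2 table.

    tp: MRI positive, diagnosed axSpA      fn: MRI negative, diagnosed axSpA
    fp: MRI positive, diagnosed non-axSpA  tn: MRI negative, diagnosed non-axSpA
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("both diagnostic groups must contain at least one case")
    sensitivity = tp / (tp + fn)  # proportion of axSpA cases that are MRI positive
    specificity = tn / (tn + fp)  # proportion of non-axSpA cases that are MRI negative
    return sensitivity, specificity


# Hypothetical counts chosen only to illustrate the arithmetic.
sens, spec = sensitivity_specificity(tp=34, fn=49, fp=0, tn=24)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

With no false positives in the non-axSpA group, specificity is 100% regardless of how many axSpA cases are MRI negative, which is why high specificity can coexist with modest sensitivity.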
Structural lesions typical of axSpA (MRI Q2) were observed in 38.2% and 51.2% of baseline and follow up scans of cases diagnosed with axSpA, respectively. For the entire cohort there was a significant increase of 9.4% (p=0.02) in cases with structural lesions from baseline to follow up, which reflected an increased proportion with a fat lesion and ankylosis (Table 2). Erosion was the structural lesion observed most frequently in axSpA, was more highly discriminatory than any active lesion per follow up diagnostic assessment, and was highly specific, being present in only a single case diagnosed at baseline as non-axSpA, and in no cases diagnosed as non-axSpA at follow up.
In exercise B, MRI was considered indicative of axSpA in 44/108 (40.7%) cases at baseline and in 43/86 (50.0%) of those diagnosed as axSpA by the rheumatologist. Change in MRI diagnosis from baseline to follow up assessments was recorded in only 10/108 (9.3%) cases (4 from axSpA to not axSpA, and 6 from not axSpA to axSpA) according to agreement by 2 readers (Table 3). Change in MRI diagnosis was recorded in only 3 cases according to a majority of readers (≥5/8). Change in rheumatologist diagnosis was recorded in 9/108 (8.3%) cases, 2 of which had a change in MRI diagnosis.
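The two consensus rules used for central reading (agreement by 2 readers, and a majority of readers) can be sketched as a threshold on the number of positive reader calls. This is an illustrative sketch only: the function name and the votes are hypothetical, and reading "agreement by 2 readers" as "at least 2 readers positive" is an assumption, not a rule stated in the text.

```python
def consensus_call(votes: list[bool], threshold: int) -> bool:
    """Return True when at least `threshold` readers recorded the finding."""
    return sum(votes) >= threshold


# Hypothetical calls from 8 central readers for one scan (3 positive, 5 negative).
votes = [True, True, False, False, False, True, False, False]

two_reader = consensus_call(votes, threshold=2)  # assumed ">= 2 readers" rule
majority = consensus_call(votes, threshold=5)    # majority rule for 8 readers (>= 5/8)

print(f"2-reader rule: {two_reader}, majority rule: {majority}")
```

A scan such as this one is positive under the 2-reader rule but negative under the majority rule, which illustrates why the 2-reader data yield more recorded changes than the majority data.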

Local Versus Central Reader Detection of MRI Lesions in the Sacroiliac Joint
The frequency of active lesions reported by local readers (61%) in cases diagnosed with axSpA was greater than for central readers (43.2% and 49.7% for majority (≥4/7) and 2 reader data, respectively) (Table 4). This difference was similar for scans limited to cases that attended for follow up evaluation and for cases where only data from DICOM scans were analyzed (supplementary Table 3).
Structural lesions typical of axSpA were reported by local readers in 44.4% of cases who were diagnosed with axSpA. This compares with 39.5% and 54.9% of cases when assessed by a majority and 2 central readers, respectively.
Discordance between central and local readers for detection of active lesions (MRI Q1) was recorded in 46 (17.8%) and 47 (18.2%) cases according to 2 and majority (≥4/7) central reader data, respectively (kappa (95%CI) of 0.64 (0.54 to 0.73) and 0.62 (0.53 to 0.72)) (Table 5). With central reading as the external standard, the false-positive rate for active lesions ('local overcall') was 27.4% and 33.3% for 2 and majority reader data, respectively. Reliability between the 7 central readers was higher, with a median kappa value of 0.74 and a range of 0.63 to 0.83 for all possible reader pairs (supplementary Table 4). Discordance between central and local readers for detection of structural lesions (MRI Q2) was noted in 66 (30.0%) and 67 (30.5%) cases according to 2 and majority (≥4/7) central reader data, respectively (kappa (95%CI) of 0.44 (0.32 to 0.55) and 0.38 (0.25 to 0.50)). Local versus central reader discrepancies were less evident when only data from DICOM scans were assessed (Table 5).
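For readers unfamiliar with the agreement measures reported above, the sketch below computes Cohen's kappa, the discordance proportion, and two candidate overcall measures from a 2×2 table of local versus central calls. The counts and function names are hypothetical. The text does not state which denominator was used for 'local overcall', so both the proportion of local positives not confirmed centrally and the conventional false-positive rate are shown as assumptions.

```python
def cohen_kappa(a: int, b: int, c: int, d: int) -> float:
    """Cohen's kappa for two raters and a binary finding.

    a: local+ / central+   b: local+ / central-
    c: local- / central+   d: local- / central-
    """
    n = a + b + c + d
    observed = (a + d) / n                                     # observed agreement
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2  # chance agreement
    return (observed - expected) / (1 - expected)


def discordance(a: int, b: int, c: int, d: int) -> float:
    """Proportion of cases in which local and central calls differ."""
    return (b + c) / (a + b + c + d)


# Hypothetical counts, not taken from the study tables.
a, b, c, d = 40, 10, 5, 45

kappa = cohen_kappa(a, b, c, d)
discordant = discordance(a, b, c, d)
unconfirmed_local_positives = b / (a + b)  # assumption: share of local+ calls rejected centrally
false_positive_rate = b / (b + d)          # assumption: local+ among central- cases

print(f"kappa = {kappa:.2f}, discordance = {discordant:.1%}")
print(f"unconfirmed local positives = {unconfirmed_local_positives:.1%}")
print(f"false-positive rate = {false_positive_rate:.1%}")
```

Kappa corrects the observed agreement for agreement expected by chance, so two tables with the same discordance proportion can yield different kappa values when the prevalence of the finding differs.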

Classification of Axial SpA
There were 159 (63.1%) patients who fulfilled the ASAS axSpA criteria based on local reading, and 148 (58.7%) and 143 (56.7%) patients based on 2 and majority central reading, respectively (Table 6). A total of 19 (7.5%) and 20 (7.9%) patients who were classified as axSpA after local reading were re-classified as not having axSpA after 2 and majority reader central evaluation, respectively.
Conversely, 8 (3.2%) and 4 (1.6%) cases who were classified as having axSpA after 2 and majority reader central evaluation, respectively, would have been re-classified as not having axSpA after local assessment. The numbers were similar when fulfillment of the imaging arm was the primary consideration (irrespective of the clinical arm).
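The classification totals in this section can be checked by simple bookkeeping: the number classified after central reading equals the number classified after local reading, minus those who lose classification, plus those who gain it. The sketch below uses the counts reported above; the function and variable names are hypothetical.

```python
def reclassified_total(baseline: int, lost: int, gained: int) -> int:
    """Cases classified as axSpA after swapping the imaging read."""
    return baseline - lost + gained


LOCAL_POSITIVE = 159  # classified as axSpA with local reading

# Counts as reported in the text for each central reading rule.
scenarios = {
    "2 central readers": {"lost": 19, "gained": 8, "reported": 148},
    "majority (>=4/7)": {"lost": 20, "gained": 4, "reported": 143},
}

for label, s in scenarios.items():
    total = reclassified_total(LOCAL_POSITIVE, s["lost"], s["gained"])
    status = "matches" if total == s["reported"] else "differs from"
    print(f"{label}: {total} classified as axSpA ({status} reported {s['reported']})")
```

The net change (11 and 16 fewer cases, respectively) is smaller than the number of individual cases whose status changed (27 and 24), because moves in the two directions partly cancel.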

Lesions (MRI Q2) on classification of axSpA
In total, 120 (55.3%) cases fulfilled the axSpA criteria based on local reading of radiographic sacroiliitis and central reading of active inflammation on MRI. This changed to 125 (57.6%) and 117 (53.9%) of cases after replacement of radiographic sacroiliitis by 2 and majority central reader MRI structural lesions, respectively (Table 6). A total of 9 (4.1%) and 4 (1.8%) cases who were classified as not having axSpA were re-classified as having axSpA after replacing radiographic sacroiliitis with 2 and majority reader MRI structural lesions, respectively. Conversely, 7 (3.2%) and 8 (3.7%) cases were re-classified as not having axSpA after substitution by 2 and majority reader MRI structural lesions, respectively. The numbers were similar when fulfillment of the imaging arm was the primary consideration (irrespective of the clinical arm).

DISCUSSION
This is the first report that describes the frequencies of the broad spectrum of active and structural MRI lesions according to recently published ASAS definitions in patients presenting to the rheumatologist with undiagnosed back pain. Active or structural lesions typical of axSpA were observed by a majority of central readers in 55% of patients diagnosed by local rheumatologists with axSpA but also in 12.9% of non-axSpA cases suggesting that axSpA may have been under-recognized by local rheumatologists.
Subchondral bone marrow edema was observed in about 50% of cases diagnosed with axSpA although the definition of an ASAS positive MRI was met in only 40%. The corresponding frequencies in non-axSpA cases were 13.2% for subchondral BME and 2.6% for an ASAS positive MRI. This is much lower than the 20-40% frequency often cited for an ASAS positive MRI in controls, both healthy and those diagnosed with non-specific back pain, in other cohorts [8][9][10] . This could be explained by central reader expertise in distinguishing BME lesions suggestive of axSpA versus non-specific findings and also the concomitant presence of structural lesions.
It reinforces the importance of contextual interpretation of T1W and fat-suppressed scans for diagnostic interpretation of MRI scans previously emphasized in an ASAS consensus exercise 3 .
The revised ASAS definition of erosion was highly discriminatory and was detected in fewer than 10% of non-axSpA cases in both reading exercises although sensitivity of 30-40% was lower than the 50-60% reported in some previous studies of MRI in axSpA 17,18 . This may reflect differences in the definition of erosion. The first ASAS publication on MRI definitions in the SIJ cited only the requirement for a bony defect at the joint margin without specifying alteration in the signal from adjacent bone marrow 2 . The revised ASAS definition stipulates both a bony defect as well as loss of the adjacent bright marrow signal observed on a T1W sequence 11 . Fat lesion with the distinct features of axSpA, namely a sharp border and homogeneous increased T1W signal, was also discriminatory but sensitivity was less than for erosion at 25-30% while specificity was 90-95%, which was comparable to findings in other cohorts of early SpA that applied a similar definition [18][19][20] .
We observed local reader overcall in the range of 25-35% when using the central reader assessment as the external standard, raising the possibility of diagnostic overcall. However, this had little impact on the number of patients classified with axSpA since patients could still be classified as axSpA by the clinical arm. Conversely, local readers detected fewer structural lesions than central readers. This could reflect the requirement for good quality T1W images so that the more complex structural lesions can be adequately visualized, as the discrepancy was less evident when DICOM images were assessed. Nevertheless, substitution of radiographic sacroiliitis by structural lesions on MRI detected by central readers had a minor impact on the number of patients classified as axSpA. This may not be surprising as most patients with structural lesions also had active lesions typical of axSpA. Similar observations have been reported in two early axSpA cohorts 21,22 .
There are some limitations of our data. It has been over a decade since the local MRI reads were conducted and it is possible that discrepancy might be less evident if the study was a contemporary comparison. However, recent clinical trials of non-radiographic axSpA 23,24 have reported similar symptom duration prior to diagnosis as noted for the ASAS-CC suggesting that diagnostic delay has not changed a great deal over the past decade and that imaging findings may therefore not be different. Interpretation of local reader data is compromised by lack of data recorded in the ASAS-CC CRF as to which types of MRI lesion were observed. The assessment of structural lesions, especially erosion, is increasingly being performed using MRI sequences that can enhance the contrast between the joint space and bone 25 .
In conclusion, our analysis of MRI scans from patients referred to rheumatologists with undiagnosed back pain demonstrates the importance of both active and structural lesions in diagnostic decision making and the importance of educational initiatives aimed at enhancing interpretation of these lesions. These data also provide reassurance that the ASAS classification criteria have performance characteristics that may circumvent the limitations posed by the widespread lack of reader expertise in the interpretation of MRI scans.
However, our study design was retrospective in nature and could not assess the impact of reader discrepancy on diagnostic ascertainment. Consequently, the performance of the ASAS criteria will require further testing in a study design where the impact of differences in interpretation of imaging on diagnostic ascertainment can be addressed.