Stratification of Patients With Sjögren’s Syndrome and Patients With Systemic Lupus Erythematosus According to Two Shared Immune Cell Signatures, With Potential Therapeutic Implications

Similarities in the clinical and laboratory features of primary Sjögren’s syndrome (SS) and systemic lupus erythematosus (SLE) have led to attempts to treat patients with primary SS or SLE with similar biologic therapeutics. However, the results of many clinical trials are disappointing, and no biologic treatments are licensed for use in primary SS, while only a few biologic agents are available to treat SLE patients whose disease has remained refractory to other treatments. With the aim of improving treatment selections, this study was undertaken to identify distinct immunologic signatures in patients with primary SS and patients with SLE, using a stratification approach based on immune cell endotypes.


INTRODUCTION
Primary Sjögren's syndrome (SS) and systemic lupus erythematosus (SLE) are chronic autoimmune rheumatic diseases that primarily affect women and that share genetic, clinical, and serologic characteristics (1). Although significant progress has been made toward improving treatment and patient-related outcomes in primary SS and SLE, there is still a need for improvement in early diagnosis and adequate therapy monitoring, as well as for new treatments for manifestations refractory to approved therapies and better strategies to address comorbidities (1).
Primary SS and SLE share etiopathogenic links. Both diseases are associated with a large number of major genetic susceptibility loci, such as HLA class II variants, BLK, IRF5, and STAT4 (2-4), while neutrophil degranulation was identified as the most significantly enriched functional epigenetic pathway in both diseases (5). In addition, a gene expression meta-analytic strategy identified transcriptomic similarities comprising overexpressed genes related to interferon (IFN)-mediated signaling pathways as well as pathways mediated by other cytokines, and similar responses to viral infection (6). The IFN signature, defined as an increased expression of type I IFN-regulated genes, has been shown to be associated with increased disease activity in both SLE and primary SS (7,8). SLE and SS are also characterized by common environmental factors (9,10), aberrant B cell (11) and T cell activation (12,13), and autoantibody production (14,15), which are reflected in the similar therapeutic approaches (16,17).
However, the clinical evolution of both primary SS and SLE is difficult to predict, as patients present at different stages in the course of their disease with diverse clinical manifestations. This suggests that distinct pathways driving chronic inflammation and immune dysregulation in primary SS and SLE are activated at a certain point in the disease course (18,19). Therefore, recognizing the underlying molecular and cellular abnormalities characterizing patient-specific disease manifestations could identify markers for disease course prediction and tailored treatment strategies.
Previous efforts to stratify patients with SLE based on gene expression identified different mechanisms of disease progression, as well as distinct clinical manifestations (20,21). Similarly, research into stratification of patients with primary SS revealed distinct patient clusters driven by an association between activated CD4+ and CD8+ T cell signatures, disease activity and glandular inflammation (22), presence or absence of SSA/SSB antibodies, presence or absence of various HLA genetic markers (23), or distinctive clinical phenotypes (24). Recognizing that immune signatures, rather than the diagnostic label in certain patients, are likely to be more important in defining the disease, researchers recently proposed a molecular taxonomy-derived reclassification of autoimmune rheumatic diseases to reflect their pathogenesis and support better patient selection for clinical trials (the PRECISESADS project) (25).
Our hypothesis is that patients with primary SS and patients with SLE share immunologic features that span diagnostic boundaries, and recognition of these features could support the development of personalized medicine strategies and thus lead to better treatment selection. In particular, we suggest that stratification based on immune cell phenotype between certain groups of patients with primary SS and patients with SLE could support the implementation of similar therapeutic strategies (e.g., use of treatments licensed for SLE in patients with primary SS with similar immunologic makeup). Furthermore, we propose a new approach of including patients with an overlapping clinical phenotype and features of both diseases, such as patients with secondary SS associated with SLE (SLE/SS), who account for 14-17.8% of SLE patient cohorts (26,27).
Using machine learning approaches in a mixed cohort of patients with primary SS, those with SLE, and those with SLE/SS, we established 2 new disease endotypes based on peripheral blood immune signatures. Results were predictive of characteristic long-term disease activity and damage trajectories.

PATIENTS AND METHODS
Study subjects. Peripheral blood was obtained from patients with primary SS (n = 45), patients with SLE (n = 29), and patients with SLE and secondary SS (n = 14) who were recruited from the Autoimmune Rheumatic Diseases Clinic at the University College London Hospitals NHS Foundation Trust. Patients with primary SS or SLE/SS satisfied the American-European Consensus Group criteria for SS (28). All SLE patients fulfilled the revised Systemic Lupus International Collaborating Clinics (SLICC) criteria for SLE (29). Table 1 shows baseline clinical and demographic characteristics of the patient cohorts. Healthy controls with no symptoms of dryness (n = 31; mean age 44 years, range 20-77 years) were also recruited, matched for sex (all participants were women) and ethnicity. All subjects were enrolled in accordance with ethics regulations approved by the National Research Ethics Service Committee South East Coast-Surrey (reference no. 14/LO/2016) following written informed consent. Peripheral blood mononuclear cells (PBMCs) were isolated from peripheral blood using Ficoll-Hypaque density-gradient centrifugation. A detailed description of data collection methodology is available in the Supplementary Methods. Samples were acquired on an LSRII flow cytometer (Becton Dickinson), and data were analyzed using FlowJo software (Tree Star).

Statistical analysis.
The study design and statistical analyses are summarized in Figure 1. Analysis of the demographic data was performed using GraphPad Prism software version 8. In each group, values are expressed as the mean and range or median and interquartile range, depending on data distribution, which was tested using the Kolmogorov-Smirnov test. Nonparametric 2-tailed Mann-Whitney test, Kruskal-Wallis test, and Dunn's multiple comparison test were performed. Categorical variables were compared using chi-square tests. Correlation analyses of nonparametric data were performed using Spearman's correlation tests. P values less than 0.05 were considered significant. Data, including demographic data, immunophenotyping data, and longitudinal clinical data, were stored in Microsoft Excel.
The immunophenotyping data were compared between the different populations including healthy controls, those with SS, those with SLE, those with SLE/SS, and the stratified patient groups. Other statistical analyses were performed in R version 3.5.2 (https://www.R-project.org/).
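The rank-based comparisons named above can be illustrated with a minimal sketch. The study itself used GraphPad Prism and R; the Python below is only an illustration, the function names are hypothetical, the input values in the usage lines are invented, and only the test statistics (not P values) are computed.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: number of (a, b) pairs with a > b, ties counting 0.5."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    rank_sum_a = sum(ranks[:n_a])
    return rank_sum_a - n_a * (n_a + 1) / 2


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical cell-subset frequencies (%), not data from the study.
frequencies_group_a = [12.1, 15.4, 9.8, 20.3]
frequencies_group_b = [8.2, 11.0, 7.5, 9.1]
u_statistic = mann_whitney_u(frequencies_group_a, frequencies_group_b)
rho = spearman_rho(frequencies_group_a, frequencies_group_b)
```

Working on ranks rather than raw values is what makes both statistics insensitive to the non-normal distributions that are typical of flow cytometry frequencies.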

Logistic regression for association analysis.
The association between the immunophenotypes of 29 parameters and patient groups was assessed, adjusted for age and ethnicity. For each measurement, the odds ratio (OR) and the 95% confidence interval (95% CI) were determined, and the P value was calculated. Forest plots were produced with the ggplot2 package in R, with significant associations highlighted in red (P < 0.05).
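As a worked illustration of how an OR and its 95% CI follow from a fitted logistic regression coefficient, the sketch below assumes a Wald-type interval (coefficient ± 1.96 standard errors on the log-odds scale). The text does not state which interval was used, so this is an assumption, and the function name and input values are hypothetical.

```python
from math import exp


def odds_ratio_with_ci(beta, standard_error, z=1.96):
    """Convert a log-odds coefficient into an OR with a Wald confidence interval.

    z = 1.96 corresponds to a two-sided 95% interval (assumed here).
    """
    odds_ratio = exp(beta)
    lower = exp(beta - z * standard_error)
    upper = exp(beta + z * standard_error)
    return odds_ratio, lower, upper


def excludes_null(lower, upper):
    """An association is flagged when the interval does not contain OR = 1."""
    return lower > 1.0 or upper < 1.0


# Hypothetical coefficient and standard error, not values from the study.
or_value, ci_low, ci_high = odds_ratio_with_ci(beta=0.8, standard_error=0.3)
flagged = excludes_null(ci_low, ci_high)
```

In a forest plot, each immune cell subset contributes one such point estimate and interval, and the subsets whose interval excludes 1 are the ones highlighted.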
Machine learning approaches. Two supervised machine learning approaches, balanced random forest and sparse partial least squares discriminant analysis, were applied for classification and parameter identification. A balanced random forest model was used for classification and variable selection using the randomForest package in R. A balanced random forest is an ensemble machine learning algorithm for classification, consisting of numerous decision trees, that can increase model accuracy while minimizing the risk of model overfitting, which is often encountered in rare data sets from smaller cohorts; thus, this approach has been employed as a way to obtain validated data from smaller samples (30). Parameters were optimized for the best outcome in each model. A detailed description of the machine learning models and data analysis platforms is available in the Supplementary Methods (available on the Arthritis & Rheumatology website at http://onlinelibrary.wiley.com/doi/10.1002/art.41708/abstract).
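What makes a random forest "balanced" is the resampling step: each tree is grown on a bootstrap sample that contains the same number of cases from every class, so the larger class cannot dominate. The sketch below illustrates only that step, under the assumption that each class contributes as many draws as the smallest class has members. The function name is hypothetical, and this is not the randomForest package's code nor necessarily the authors' exact configuration.

```python
import random
from collections import Counter, defaultdict


def balanced_bootstrap(labels, rng):
    """Return sample indices with equal draws (with replacement) from each class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    # Every class contributes as many draws as the smallest class has members.
    n_per_class = min(len(members) for members in by_class.values())
    sample = []
    for label in sorted(by_class):
        sample.extend(rng.choices(by_class[label], k=n_per_class))
    return sample


# Class sizes taken from the cohort description (45 primary SS, 29 SLE).
labels = ["SS"] * 45 + ["SLE"] * 29
sample = balanced_bootstrap(labels, random.Random(0))
class_counts = Counter(labels[i] for i in sample)  # 29 draws per class
```

Each tree in the forest would be trained on a fresh sample of this kind, and the cases not drawn for a given tree remain available for out-of-bag error estimation.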
Clinical trajectory analysis. The trajectories of patient clinical measures over time (expressed as visits/year; n = 5) are depicted by a spaghetti plot. The flow of the longitudinal data of patients (those with SS, those with SLE, and those with SLE/SS; n = 88) is shown in each plot, where each line represents one parameter from each patient. Smoothing lines were added to indicate the trend of patient groups as identified from K-means clustering analysis. Plots were produced using R package "ggplot2."
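For readers unfamiliar with K-means clustering, which defines the patient groups referred to above, the following is a minimal sketch of the standard Lloyd iteration with k = 2. The deterministic farthest-point initialization, the function names, and the toy two-dimensional points are assumptions made for illustration; the study's own clustering settings are described in its Supplementary Methods.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Cluster points into k groups; returns (labels, centers)."""
    # Farthest-point initialization (an assumption chosen for reproducibility).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Toy standardized frequencies for two hypothetical cell subsets.
toy_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
toy_labels, toy_centers = kmeans(toy_points, k=2)
```

Because the clustering uses only the immune cell measurements, the resulting groups are independent of the diagnostic label, which is what allows them to span primary SS, SLE, and SLE/SS.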

RESULTS
Similar immunologic architecture comprising a shared immune signature in patients with primary SS and patients with SLE.
We compared routinely available clinical information from patients with primary SS, those with SLE, and those with SLE/SS to determine whether it could be used to identify similarities and differences between the patient groups irrespective of diagnosis (Table 1). Patients with primary SS were older (mean age 59 years, range 30-78 years) compared to patients with SLE (mean age 48 years, range 21-72 years) and patients with SLE/SS (mean age 55 years, range 26-56 years). All patients were women. Disease activity scores were not different between the patient groups when comparing the EULAR Sjögren's Syndrome Disease Activity Index (ESSDAI) (31) and Systemic Lupus Erythematosus Disease Activity Index 2000 (SLEDAI-2K) (32) scores, as applicable. Of note, the majority of patients included in this study had low or no disease activity. In comparing the SLICC/American College of Rheumatology ... The 3 patient groups were also strikingly similar in most other clinical and laboratory features, except in the comparison of disease duration, which was significantly longer in patients with SLE compared to patients with primary SS. Anti-Ro and anti-La autoantibodies and rheumatoid factor were more common in patients with primary SS compared to patients with SLE. Frequency of treatment with conventional disease-modifying antirheumatic drugs differed significantly among the 3 patient populations. This reflects current practice: fewer patients with primary SS were treated with these agents, as the evidence of their efficacy is very limited. Only 14% of SLE patients had received rituximab, 4-16 years before blood samples were collected.
To assess whether immune cell phenotyping could be used to stratify patients within the 3 different autoimmune diseases, 29 different B cell, CD4+ T cell, and CD8+ T cell subsets were examined (see Figure 1 for the analysis strategy and Supplementary Figure 1) (34,35).
However, when comparing the immune profiles of patients with primary SS and patients with SLE using a variety of statistical and machine learning approaches, very few statistically significant differences were observed between the 2 cohorts (Figure 2). Only 5 of 29 immune cell subsets had differential frequencies between patients with primary SS and patients with SLE, as determined by the Mann-Whitney test, Kruskal-Wallis test, and a univariate logistic regression analysis: transitional mature B cells (Bm2′), late memory mature Bm5 cells, IgD-CD27- B cells, and CD8+ naive T and effector memory T (Tem) cells (Figures 2A-C). These findings were confirmed using machine learning approaches, with the optimized balanced random forest model showing a poor performance of these immune cell profiles in distinguishing between primary SS and SLE (area under the curve [AUC] 0.7096) (Figure 2D). Results from the sparse partial least squares discriminant analysis model showed a large overlap between the immune cell profiles of patients with primary SS and those with SLE (Figure 2E). Together, the results of these comprehensive comparison analyses suggest that while patients with SLE and those with SS had multiple significant immune phenotype differences compared to healthy controls, few statistically significant differences in the immune phenotype were observed between patients with SLE and those with SS, despite the patients having different clinical presentations and diagnoses.
(... and Supplementary Figure 5, available on the Arthritis & Rheumatology website at http://onlinelibrary.wiley.com/doi/10.1002/art.41708/abstract). Furthermore, a correlation analysis of immune cell frequencies revealed significant differences in immune cell associations between group 1 and group 2 (Figure 3B). To support these findings, a univariate logistic regression analysis was performed.
Nearly half of the immune cell subsets (13 of 29) showed significant alterations in their frequencies between groups (Figure 4A). These results were further confirmed using machine learning approaches, in which the optimized balanced random forest model, with classifications assessed using 10-fold cross-validation, yielded an AUC of 0.9942 for distinguishing between the 2 patient groups (Figure 4B).
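To make the reported AUC values and the 10-fold cross-validation concrete, the sketch below shows one way to compute an AUC from classifier scores and to partition patients into folds. The AUC is calculated as the proportion of (positive, negative) pairs that the scores rank correctly, which is equivalent to the area under the receiver operating characteristic curve. The function names, the scores, and the unshuffled interleaved fold assignment are illustrative assumptions rather than the study's procedure.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 1 for the positive class and 0 for the negative class;
    tied scores count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k folds of nearly equal size."""
    return [list(range(start, n_samples, k)) for start in range(k)]


# Hypothetical predicted probabilities of belonging to group 2.
example_scores = [0.9, 0.4, 0.6, 0.1]
example_labels = [1, 1, 0, 0]
example_auc = roc_auc(example_scores, example_labels)

# 88 patients (45 primary SS, 29 SLE, 14 SLE/SS) split into 10 folds.
folds = k_fold_indices(88, 10)
```

In cross-validation, each fold is held out once while the model is trained on the remaining nine, and the held-out predictions are pooled or averaged to give the reported performance. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.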

Two groups of patients identified as having shared immune signatures across primary SS, SLE, and SLE/SS.
The top contributing immune features ranked using the mean decrease in Gini coefficient suggested a strong divergence of CD8+ T cell subsets between patients in group 1 and patients in group 2, including CD8+CD25-CD127-, CD8+ responder (CD127+CD25-), CD8+ Temra, CD8+ naive, CD8+ Tem, and total CD8+ T cells (Figure 4C). Balanced random forest classification models performed better when discriminating between group 2 and healthy controls (AUC 0.8999) compared to discriminating between group 1 and healthy controls (AUC 0.7749), suggesting that patients in group 2 had more aberrant immune cell profiles compared to healthy controls than did patients in group 1 (Figure 4B). Sparse partial least squares discriminant analysis also showed a clear separation between the 2 patient groups (Figure 4D) and identified similar immune cell subsets as being important in driving the group 1 stratification compared to group 2 (Figure 4E). Comparison of the results from multiple analysis approaches revealed that 8 immune cell subsets were common to all 4 analysis methods: total CD4+ and CD4+ Temra T cells, total CD8+ and CD8+ naive, Tem, Temra, responder T cells, and CD25-CD127- T cells (Supplementary Table 1). In addition, the accuracy of the classification models was maintained at 96.16% in the 10-fold cross-validation analysis. Thus, despite patients with primary SS and those with SLE having low or no disease activity, these patients could still be stratified using their immune cell profile (Supplementary Table 2, available at http://onlinelibrary.wiley.com/doi/10.1002/art.41708/abstract). These findings suggest that differences in global immunologic features in these patients are a reflection of shared underlying immunopathogenesis, rather than being a reflection of the level of disease activity or the specific disease diagnosis.
Increased disease activity in patients in group 2. To assess whether the distinct immunologic profiles also reflect differences in clinical and disease features, laboratory markers (including anti-Ro and anti-La autoantibodies and rheumatoid factor), disease activity and damage scores, and treatments were compared between patients in group 1 and group 2 at the time of sample collection (Supplementary Table 3 and Supplementary Figure 7, available at http://onlinelibrary.wiley.com/doi/10.1002/art.41708/abstract). Patients from both groups had similar disease outcomes and serologic biomarker levels overall, although patients in group 2 had a significantly elevated erythrocyte sedimentation rate (ESR), decreased hemoglobin (Hgb) levels, and increased ESSDAI scores compared to patients in group 1, suggesting that the disease state was more active at baseline in group 2, although disease activity was still predominantly low overall. In addition, frequencies of different therapies were not significantly different between SLE patients in group 1 and SLE patients in group 2 (SLE patients have more treatment options compared to the options available for patients with primary SS) (Supplementary Table 4, available at http://onlinelibrary.wiley.com/doi/10.1002/art.41708/abstract), thus suggesting that the identified immune cell signatures were not driven by differences in treatment, but rather could reflect the underlying disease pathogenesis.
To further investigate whether the grouping was clinically meaningful as a potential predictor of disease course, a wide range of clinical measurements were collected longitudinally at 5 subsequent annual encounters, including serologic markers and disease-specific outcome measures. Individual patients' disease trajectories for these assessments were compared between group 1 and group 2. Over the 5-year clinical encounter timeframe, patients in group 2 had overall higher disease activity compared to patients in group 1. As expected, disease activity (measured using the SLEDAI-2K) fluctuated over time in the SLE and SLE/SS patient groups; although patients had low disease activity (SLEDAI-2K <3), there was a general trend toward more active SLE in group 2. The ESSDAI scores were characterized by less fluctuation over time and were marginally increased in the patients with primary SS and those with SLE/SS from group 2. Interestingly, all patients in group 2, irrespective of diagnosis, had increased SDI damage scores (Figure 5A and Supplementary Figure 8, available on the Arthritis & Rheumatology website at http://onlinelibrary.wiley.com/doi/10.1002/art.41708/abstract). Patients in group 2 overall also had decreased Hgb levels and elevated ESR, which corresponded to their slightly higher disease activity (Figure 5B). No other laboratory biomarkers had the capacity to discriminate between patients in group 1 and group 2.
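The "mean decrease in Gini" used to rank immune features can be illustrated at the level of a single tree split. In a random forest, a feature's importance of this kind is the impurity decrease achieved by splits on that feature, summed within each tree and averaged over all trees. The sketch below computes the decrease for one split only; the function names and the toy group labels are hypothetical.

```python
def gini_impurity(labels):
    """Gini impurity 1 - sum(p_c^2) over the class proportions p_c."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def gini_decrease(parent, left, right):
    """Impurity of the parent node minus the size-weighted impurity of its children."""
    n = len(parent)
    weighted_children = (
        len(left) / n * gini_impurity(left) + len(right) / n * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


# Toy node containing two patients from each group.
parent_node = [1, 1, 2, 2]
informative_split = gini_decrease(parent_node, [1, 1], [2, 2])
uninformative_split = gini_decrease(parent_node, [1, 2], [1, 2])
```

A feature such as a CD8+ T cell subset frequency ranks highly when thresholds on it repeatedly produce child nodes that are much purer in group membership than their parent.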

Correlations between immune cell subtypes and baseline clinical measurements.
To assess whether the distinct immune cell profiles identified across the 3 disease phenotypes were associated with distinct clinical features, a correlation analysis was performed within the mixed patient population. Correlations between the immune cell frequencies and clinical characteristics of patients in each group were calculated using Pearson's correlation coefficients (Supplementary Figure 9, available on the Arthritis & Rheumatology website at http://onlinelibrary.wiley.com/doi/10.1002/art.41708/abstract). In concordance with the baseline immune cell phenotype characterization and trajectory analysis, ESR was significantly correlated with 4 CD8+ T cell subtypes, 3 CD4+ T cell subtypes, and 2 B cell subpopulations, which overlapped with the cell subsets driving the K-means clustering of patients in groups 1 and 2. Hgb level only correlated positively with the frequency of CD8+ Tcm cells in the mixed patient population. Disease damage scores across the mixed patient populations significantly correlated with CD8+ T cell frequencies, including CD8+CD25-CD127-, CD8+ responder T cells, and CD8+ Temra cells, which were the top ranked immune features from the machine learning models.

DISCUSSION
We propose a new classification for patients with primary SS, those with SLE, and those with SLE/SS based on unique peripheral blood immune signatures that are predictive of distinct long-term disease activity and damage trajectories in those with low or no disease activity. The 2 patient groups (endotypes) spanning the diagnostic boundaries we describe here are robust, as they have been derived from a complex analysis with several cross-validation steps.
Although initial characterization of the 3 disease phenotypes included in our analysis showed differences in age and disease duration, as well as serologic markers and treatment, as previously reported in another study in patients with primary SS, those with SLE, and those with SLE/SS (36), we have shown for the first time that patients with primary SS and patients with SLE with low-to-moderate or no disease activity have very few significant differences in immunologic architecture. This comprised differences in 5 of 29 immune cell subsets, which included transitional Bm2′ cells, late memory Bm5 cells, IgD-CD27- B cells, and CD8+ naive and CD8+ Tem cells. Previous immunophenotyping studies in primary SS indicated a predominance of naive B cells, as well as lower frequencies and absolute numbers of memory B cells (37,38) and opposite trends in SLE (39), findings which were replicated in our study as well. The role of T cells in the pathogenesis of both primary SS (12) and SLE (13) has been established in the literature. SLE is associated with T cell functional alterations and increased effector and decreased regulatory T cell responses, while an overall shift toward Th1 phenotype activation has been previously identified in primary SS.
Our analysis identified 2 new disease endotypes within our mixed cohort, which were characterized by differential immune signatures that had a higher capacity for discriminating between patients than the immune signatures associated with the diagnostic label (area under the receiver operating characteristic curve 0.99 compared to 0.71). These findings highlight the shared immunopathogenic processes underlying primary SS and SLE manifestations that are likely to be more relevant for treatment selection strategies than basing treatment selection on disease diagnosis alone. In addition, the altered immune landscape associated with the 2 endotypes had predictive value for determining long-term disease trajectories related to disease activity and damage.
Previous patient stratification approaches in primary SS and SLE were mainly directed at cohorts of patients with the same diagnosis, despite the use of shared treatment strategies across many autoimmune rheumatic diseases. However, potential biomarkers shared by different autoimmune diseases have been described, including an expanded CD8+ memory T cell population associated with poor prognosis in both small vessel vasculitis and SLE (40) or elevated expression of genes related to CD8+ T cell responses, which correlated with poor prognosis in Crohn's disease and ulcerative colitis (41). This suggests that exploring biomarker commonalities within autoimmune diseases could expand the understanding of their potential shared pathogenic mechanisms. Prior efforts to elucidate the molecular heterogeneity of SLE revealed that IFN signatures are associated with disease activity (21,42) and that neutrophil transcripts are enriched during the progression to active nephritis (21). Such efforts also revealed transcriptional fingerprints that were shared across various autoimmune, inflammatory, and infectious diseases and were found to be associated with SLE disease progression (43). Our future studies will focus on exploring the role of these signatures in our patient groups.
Several B cell-targeted biologic therapies have been separately investigated in both patients with primary SS and patients with SLE (16,44). However, the only licensed anti-B cell biologic therapy for SLE (belimumab) is only approved for use in patients with nonrenal SLE manifestations (45) and has no proven clinical efficacy in primary SS, despite findings showing that this treatment normalizes the B cell frequency, phenotype, and function in patients with primary SS (46). Anti-CD20 monoclonal antibody therapy failed to meet the primary end point evaluated in randomized controlled trials in primary SS or SLE, despite being associated with some benefits (47,48) and being proven effective in other studies and case series (49-51). Exciting data have recently emerged regarding the potential clinical efficacy of a new biologic therapy for primary SS, ianalumab, which has a dual mode of action combining BAFF receptor inhibition and B cell depletion (52). To date, the limited therapeutic success in primary SS and SLE emphasizes the need to rethink the way that treatment targets are selected, in order to pinpoint the role of shared pathogenic commonalities across diseases, rather than selecting patients based on diagnostic labels or composite measures of disease activity.
The likely impact of our findings will include a new classification of patients with primary SS based on one of the two immune signatures derived from this analysis, using a simplified immunologic toolkit that includes the immune markers that drove patient clustering in group 1 compared to group 2. As patients included in group 1 had better outcomes based on disease trajectories, with no difference in medications used, and also had CD4+:CD8+ T cell ratios within normal range (compared to significantly increased ratios in group 2; P < 0.0002), we can hypothesize that patients with a group 2 immune signature across the 3 disease phenotypes could benefit from treatment with mycophenolate mofetil (which has been shown to restore the significantly lower CD4+:CD8+ T cell ratio associated with SLE in patients who responded to treatment with mycophenolate mofetil [53]). Also, since treatment with belimumab is associated with the depletion of naive and transitional B cells in patients with primary SS who responded to therapy (46), we could hypothesize that patients with primary SS stratified in group 1 are more likely to respond to belimumab, as they have an increased transitional B cell (Bm2′) signature.
Further research, including patient stratification, using the identified signatures to determine inclusion in interventional clinical trials of therapies that predominantly target B cells (rituximab, belimumab) compared to T cells (abatacept) is required to establish if the signatures we identified have predictive biomarker values for responses to certain therapies. In addition to stratifying patients for better treatment selection, our results can offer new therapeutic options for patients with primary SS who share immune signatures with selected SLE patients, by providing access to treatments licensed for use in SLE. This can lead to changes in clinical practice through the implementation of best-evidence personalized treatment strategies derived from interventional clinical trials using the stratification tool we are proposing here, to improve the benefit to the patient and justify access to existing SLE treatments for selected patients with primary SS.
Although patients included in our analysis have well-controlled or mild-to-moderately active disease, the disease trajectory analysis identified differences in accumulated damage over time between the 2 groups and higher ESSDAI scores in group 2, suggesting that closer monitoring may be required for patients with a group 2 immunologic signature. Investigating the immune signatures associated with various severe organ and system flares, as well as exploring the immune signatures present in target-organ tissue biopsy specimens, were beyond the scope of this study, as this would have introduced additional confounding factors and would have required a much larger sample size.
Our study has certain limitations: the patients were all women and all had well-controlled or mild-to-moderate disease activity; therefore, we were unable to evaluate the influence of sex bias or the impact of high disease activity or severe flares on the identified immune signatures. This was an exploratory study; therefore, corrections for multiple comparisons were not performed so that potentially important markers would not have been excluded; thus, Type 1 family-wise errors could occur for some analyses. External validation will be required to assess whether the identified signatures can be reproduced in a study with a larger sample size, as well as to account for potential Type 1 family-wise errors and investigate whether other immune signatures, which this study had no statistical power to detect, could be identified. While in this study we stratified patients based on statistically significant differences in immune cell phenotype frequencies, functional experimental work is needed to categorically define whether patients in the stratified groups are immunologically similar, which will be the focus of future studies.
In conclusion, we propose the reclassification of patients with primary SS, patients with SLE, and patients with SLE/SS based on an immune cell toolkit comprising a limited immune cell set that can differentiate patients with high accuracy. Our results demonstrate that selection and validation of patients using machine learning approaches could be proven to be a suitable strategy to select patients for targeted therapeutic approaches.