Personalized prediction of disease activity in patients with rheumatoid arthritis using an adaptive deep neural network

Background: Deep neural networks learn from former experiences on a large scale and can be used to predict future disease activity as potential clinical decision support. AdaptiveNet is a novel adaptive recurrent neural network optimized to deal with heterogeneous and missing clinical data. Objective: We investigate AdaptiveNet for the prediction of individual disease activity in patients from a rheumatoid arthritis (RA) registry. Methods: Demographic and disease characteristics from over 9500 patients and 65,000 visits from the Swiss Clinical Quality Management (SCQM) database were used to train and evaluate the network. Patient characteristics, clinical and patient-reported outcomes, laboratory values and medication were used as input features. DAS28-BSR served as a target to predict active RA and future numeric individual disease activity by classification and regression. Results: AdaptiveNet predicted active disease, defined as DAS28-BSR >2.6 at the next visit, with an overall accuracy of 75.6% (SD ±0.7%) and a sensitivity and specificity of 84.2% (SD ±1.6%) and 61.5% (SD ±3.6%), respectively. Prediction performance was significantly higher in patients with a disease duration >3 years and positive rheumatoid factor. Regression allowed forecasting individual DAS28-BSR values with a mean squared error (MSE) of 0.9 (SD ±0.05). This corresponds to an 8% deviation between estimated and real DAS28-BSR values. Compared to linear regression, random forest and support vector machines, AdaptiveNet showed an increased performance of over 7% in MSE. Medication played a minor role in the prediction of RA disease activity. Conclusion: AdaptiveNet has a superior capacity to predict numeric RA disease activity compared to classical machine learning architectures. All investigated models were limited by low specificity.


INTRODUCTION
Rheumatoid arthritis (RA) is a chronic inflammatory disorder in which disease activity fluctuates over time. The advent of targeted synthetic and biologic medication, along with early and treat-to-target strategies, has substantially improved patient care. However, sustained remission is still achieved in only around 30% of patients, indicating room for improvement either by new drugs or alternative treatment strategies 1. EULAR/ACR recommendations suggest treatment modification after three to six months if the set target is not reached, regardless of the presence or absence of individual risk factors for poor outcome 2. Given the increasing number of available drug combinations, the delay in finding the best individual treatment can be substantial. Despite the advent of new biomarkers, their practical role in predicting individual chances of good therapeutic response remains limited 3,4. There are also no clear recommendations on treatment de-escalation in case of stable disease 5, despite disease activity-guided dose optimisation of biologics being efficient and cost-effective 6. In other words, over- or undertreatment in RA is common, potentially resulting either in destructive disease flares or unnecessary side effects and costs 7.
Machine learning (ML) is a relatively new approach for disease detection, disease stratification and disease prediction, both in at-risk populations and in established disease 8.
Using data from electronic medical records (EMR), ML has successfully predicted RA flares in a small number of RA patients using a random forest, a basic machine learning method 9.
Norgeot et al. applied deep learning (DL) 10, a newer subfield of ML, to EMR data from 820 RA patients to predict disease activity by classification 11. For predicting the category of low disease activity, a remarkable AUC score of 0.91 was achieved in a test set of 116 patients.
Medication in this setting could not be assessed due to the incomplete EMR dataset. Using the Swiss Clinical Quality Management (SCQM) database 12 for rheumatic diseases, we recently described a novel adaptive deep neural network that was superior to conventional DL architectures in predicting disease activity in a larger dataset of 9500 RA patients 13.
The study presented here aims to characterize this deep neural network in RA patients and to forecast individual disease activity both categorically and numerically, as a potential tool for clinical decision support.

Dataset
The dataset used is the Swiss Clinical Quality Management in Rheumatic Disease (SCQM) registry, a national multicenter database containing longitudinal data from clinically diagnosed RA patients. Patients are followed up with one to four visits yearly, including clinical, radiographic and patient-reported outcome data. Characteristics of the database are described elsewhere 12. The collection of patient data for the SCQM registry was approved by a national review board and all individuals willing to participate sign an informed consent form before enrolment, in accordance with the Declaration of Helsinki.
medRxiv preprint doi: https://doi.org/10.1101/2020.09.03.20168609; this version posted September 5, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Input features and prediction target
To predict disease activity, we used DAS28-BSR as the target variable and considered only visits with complete scores. We used age, gender, weight, disease duration, BSR, CRP, swollen joint count, painful joint count, rheumatoid factor, anti-CCP, treatment, smoking status, HAQ, morning stiffness, EuroQol, disease activity and pain level as potential predictors. For antirheumatic therapy, we used the individual drugs as well as broader drug categories (biologics, csDMARDs and prednisone dose strata) and the duration of therapy since adjustment. For training and evaluation of the predicted target variable, we considered follow-up visits between 1 month and 1 year after the preceding visit. As input features, we considered the visit and medication data of the last 5 years.
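The selection rules above can be illustrated with a minimal sketch. It is not the authors' preprocessing code: the function name `build_samples`, the tuple layout, the 30-day and 365-day approximations of "1 month" and "1 year", and the choice to measure the 5-year window back from the most recent visit are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumed approximations of the windows named in the text.
MIN_GAP = timedelta(days=30)        # "1 month" taken as 30 days (assumption)
MAX_GAP = timedelta(days=365)       # "1 year" taken as 365 days (assumption)
HISTORY = timedelta(days=5 * 365)   # "last 5 years", leap days ignored (assumption)


def build_samples(visits):
    """Turn one patient's visits into (history, target) training samples.

    visits: list of (visit_date, das28) tuples sorted by date; das28 is
    None when the score is incomplete. This layout is hypothetical.
    """
    samples = []
    for k in range(1, len(visits)):
        target_date, target_das28 = visits[k]
        prev_date, _ = visits[k - 1]
        # Only visits with a complete DAS28-BSR score can serve as targets.
        if target_das28 is None:
            continue
        # Keep follow-up visits between 1 month and 1 year after the last visit.
        gap = target_date - prev_date
        if not (MIN_GAP <= gap <= MAX_GAP):
            continue
        # Input history: earlier visits within 5 years of the most recent one.
        history = [v for v in visits[:k] if prev_date - v[0] <= HISTORY]
        samples.append((history, target_das28))
    return samples
```

Medication events would be filtered with the same 5-year window; they are left out here to keep the sketch short.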

Classification and regression
For classification, we defined two disease states, active disease (DAS28-BSR > 2.6) and remission (DAS28-BSR ≤ 2.6) at next visit. Prediction performance was measured by accuracy, sensitivity, specificity and area under the curve (AUC) score. In order to predict numeric values of the target variable (DAS28-BSR), we applied a regression model and predicted the expected change of DAS28-BSR to the subsequent visit. Performance was measured by mean squared error (MSE) as an estimator of the deviation between the estimated and actual values. To evaluate the models, we split the dataset into a training set and a validation set by using 5-fold cross-validation.
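As a minimal sketch of the evaluation measures named above (not the authors' evaluation code), the following shows the DAS28-BSR cut-off, the confusion-matrix metrics, the MSE and one simple way to form 5 folds. The function names and the interleaved fold assignment are assumptions; the metric functions assume that both classes occur in the data.

```python
REMISSION_CUTOFF = 2.6  # DAS28-BSR > 2.6 is active disease, <= 2.6 is remission


def is_active(das28):
    """Binary classification label for one visit."""
    return das28 > REMISSION_CUTOFF


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from boolean labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # share of active visits detected
        "specificity": tn / (tn + fp),   # share of remission visits detected
    }


def mean_squared_error(actual, predicted):
    """Mean squared deviation between actual and predicted DAS28-BSR."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def k_fold_indices(n, k=5):
    """Yield (train, validation) index lists; folds are interleaved (assumption)."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        yield train, held_out
```

Each of the 5 folds serves once as the validation set while the remaining folds are used for training; reported values would then be the mean and standard deviation over folds.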

Modelling
Classification and regression were performed with AdaptiveNet, a dynamic and recurrent deep neural network architecture designed for chronological clinical data 13. In short, AdaptiveNet encodes all former clinical events of a patient (here: visits and medication adjustments) to the same latent space using multiple fully-connected encoder networks in order to align the corresponding output vectors (Figure 1). Sorted lists of these encoded clinical events are pooled by a long short-term memory (LSTM) 14 to compute a fixed-length encoding representing the 5-year patient history. The final output is computed by a fully-connected network module, using the encoded patient history and additional features containing general time-independent patient information as input. Preprocessing, architecture, learning rate, optimizer and batch size are described in Hügle et al. 13. We used MSE loss for regression and binary cross-entropy loss for classification. To estimate feature importances, we additionally trained a random forest 15 with a maximum depth of 10.
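The data flow described in this section can be sketched in a few lines of plain Python. This is only an illustration under strong assumptions and not the published AdaptiveNet implementation: the weights are random and untrained, each encoder is a single bias-free layer, and the sizes `LATENT` and `HIDDEN`, the event tuple layout and all function names are hypothetical. The actual architecture and hyperparameters are those given in Hügle et al. 13.

```python
import math
import random

LATENT = 4  # hypothetical size of the shared latent space
HIDDEN = 3  # hypothetical size of the LSTM state


def make_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(W, features):
    """One fully-connected layer with ReLU: maps an event to the latent space."""
    return [max(0.0, v) for v in matvec(W, features)]


def lstm_pool(events, gates):
    """Run an LSTM over the encoded events and return the last hidden state."""
    h = [0.0] * HIDDEN
    c = [0.0] * HIDDEN
    for x in events:
        z = x + h  # concatenate the event encoding and the previous state
        i = [sigmoid(v) for v in matvec(gates["i"], z)]    # input gate
        f = [sigmoid(v) for v in matvec(gates["f"], z)]    # forget gate
        o = [sigmoid(v) for v in matvec(gates["o"], z)]    # output gate
        g = [math.tanh(v) for v in matvec(gates["g"], z)]  # candidate state
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h


def init_params(n_visit, n_med, n_static, seed=0):
    rng = random.Random(seed)
    return {
        # One encoder per event type, all mapping to the same latent size.
        "enc": {
            "visit": make_matrix(LATENT, n_visit, rng),
            "medication": make_matrix(LATENT, n_med, rng),
        },
        "lstm": {k: make_matrix(HIDDEN, LATENT + HIDDEN, rng) for k in "ifog"},
        "head": [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN + n_static)],
    }


def predict_active(history, static, params):
    """history: list of (time, kind, features); static: time-independent features."""
    ordered = sorted(history, key=lambda event: event[0])  # chronological order
    encoded = [encode(params["enc"][kind], feats) for _, kind, feats in ordered]
    pooled = lstm_pool(encoded, params["lstm"])  # fixed-length patient history
    logit = sum(w * v for w, v in zip(params["head"], pooled + static))
    return sigmoid(logit)  # probability-like score for active disease
```

The point of the sketch is that visits and medication adjustments have different feature sets but end up in one chronologically sorted sequence of equally sized vectors, so the pooled history has the same length regardless of how many events a patient has. For regression, the final sigmoid would be dropped and the output trained with MSE loss instead of binary cross-entropy.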

Categorical prediction of disease activity by classification
In total, 28,601 visits with corresponding disease activities were extracted. Over a maximal [...] Data from patients aged >50 years achieved a higher accuracy and sensitivity to predict active disease than data from patients aged <50 years. Anti-CCP positive status showed only a marginally better learning performance (AUC 0.73 vs. 0.72) for this task (Table 1).

Numerical prediction of disease activity by regression
AdaptiveNet was applied to predict the numerical DAS28-BSR value at the next visit by regression on an individual level. When trained on data from all patients, we obtained an overall MSE of 0.9, which corresponds to an 8% deviation between estimated and real DAS28-BSR values (Table 1). In contrast to classification, regression had lower MSE values and thus performed better in female and RF-negative patients.

Feature importance
Feature importance was determined to define the relative importance of variables for disease prediction (Supplement 1). Apart from the target variable itself, the number of painful joints, longer disease duration and age turned out to be the most relevant factors, followed by medication in general, time point of last medication adjustment, number of swollen joints, and HAQ. The importance of medication type (csDMARD vs. bDMARD or corticosteroids) for the prediction of DAS28-BSR was only marginal. Infliximab, tocilizumab and steroids had a slightly higher influence than csDMARDs or other bDMARDs in predicting disease activity.

DISCUSSION
This study demonstrates a comprehensive classification and regression analysis using deep learning on a large RA dataset. This ML technique allowed us to generate individual predictions of subsequent DAS28-BSR values, as shown in Figure 3, instead of disease states alone, which might facilitate the application of DL predictions in the clinic. Thus, DL could foster personalized medicine, e.g. by assisting in setting control intervals and in (de)escalation of treatment. As a further new finding, we describe that long disease duration, age >50 and antibody positivity increase the predictability of active disease. This information could be of importance, e.g. for patient selection in future ML-assisted clinical trials. In contrast to classification, the prediction of numeric DAS28-BSR by regression performed better in female than in male patients and in anti-CCP positive than in anti-CCP negative patients. We postulate that this difference arises because classification tasks are prone to overfit to the old class, i.e. to predict no change from the previous situation. In this case, this means that patients in remission for a long period will likely stay in remission and, vice versa, patients resistant to multi-line treatment will more likely remain in active disease. This might also apply to similar classification results shown in other studies 11. We also performed a classification task to predict an increase or decrease of DAS28-BSR at the next visit (data not shown). This task had a lower accuracy, likely because small fluctuations of DAS28-BSR values are more difficult to predict. Therefore, we speculate that regression could be a more adequate prediction tool for ML-assisted care than classification. Variations of 8% between estimated and real DAS28-BSR values seem acceptable for implementing disease prediction in clinical practice. Independently, variable and noisy data in medical databases remain a major challenge, both in EMR and in registry datasets.
Advanced architectures like AdaptiveNet improve prediction performance on such data compared to classical ML methods 13, e.g. by taking into account the timing between visits and therapy initiation. The relatively low specificity, however, shows further room for improvement. Taking into account larger datasets through -omics or digital biomarkers, e.g. from wearables, and more patient-reported outcomes might further improve the results of disease prediction 16. Somewhat surprisingly, medication was less important for the prediction of disease activity than age or disease duration. This might be explained by limited effectiveness after multiline treatments or by the vulnerability of DAS28-BSR as a target variable to confounding factors such as fibromyalgia. The slightly higher performance of infliximab in forecasting disease activity is reasonable from a clinical perspective, given its intravenous application and dose. Whether DL is able to predict drug survival or individual treatment responses needs to be clarified. Further studies also need to investigate the performance of DL using alternative input and target features, including markers of disease activity other than DAS28-BSR. Taken together, we are convinced that DL will play an increasing role in improving patient care and in fostering personalized treatment and shared decision-making in patients with RA. Prospective trials will be necessary to prove the efficacy, safety and cost-effectiveness of ML-assisted care in arthritis.
PATIENT AND PUBLIC INVOLVEMENT: Patients will be informed via different channels (e.g. the SCQM website) about the advances of deep learning in disease prediction gained in this study.