- Open Access
Prediction of long-term mortality by using machine learning models in Chinese patients with connective tissue disease-associated interstitial lung disease
Respiratory Research volume 23, Article number: 4 (2022)
Accurate risk assessment is crucial for the management of patients with connective tissue disease-associated interstitial lung disease (CTD-ILD). In the present study, we developed a nomogram to predict 3‑ and 5-year mortality using a machine learning approach and tested the ILD-GAP model in Chinese CTD-ILD patients.
CTD-ILD patients who were diagnosed and treated at the First Affiliated Hospital of Zhengzhou University between February 2011 and July 2018 were enrolled according to predefined criteria. Cox regression with the least absolute shrinkage and selection operator (LASSO) was used to select predictors and generate a nomogram. Internal validation was performed using bootstrap resampling. The nomogram and the ILD-GAP model were then compared using likelihood ratio testing, Harrell’s C index, integrated discrimination improvement (IDI), net reclassification improvement (NRI) and decision curve analysis.
A total of 675 consecutive CTD-ILD patients were enrolled in this study. During a median follow-up of 50 (interquartile range, 38–65) months, 158 patients died (mortality rate 23.4%). After feature selection, 9 variables were identified: age, rheumatoid arthritis, lung diffusing capacity for carbon monoxide, right ventricular diameter, right atrial area, honeycombing, immunosuppressive agents, aspartate transaminase and albumin. A predictive nomogram integrating these variables provided better mortality estimates than the ILD-GAP model according to likelihood ratio testing, Harrell’s C index (0.767 and 0.652, respectively) and calibration plots. Compared with the ILD-GAP model, the nomogram improved the IDI (3- and 5-year, 0.137 and 0.136, respectively) and the NRI (3- and 5-year, 0.294 and 0.325, respectively). In addition, decision curve analysis indicated that the nomogram was more clinically useful.
Our results suggest that the ILD-GAP model may not be well suited to predicting mortality risk in Chinese CTD-ILD patients. The nomogram we developed performed well in predicting 3‑ and 5-year mortality risk in Chinese CTD-ILD patients, but further studies and external validation are required to determine its clinical usefulness.
Connective tissue disease (CTD), which involves multiple autoimmune mechanisms, is characterized by self-directed inflammation that often leads to collagen deposition, tissue damage and ultimately target organ failure [1]. CTD can involve multiple organs and systems, among which interstitial lung disease (ILD) remains a main cause of morbidity and mortality [2]. The median survival time for patients with CTD-associated ILD (CTD-ILD) has been reported to be around 6.5 years, and up to 12.4% of patients with CTD-ILD die of ILD [3, 4]. Accurate risk assessment is therefore crucial for the management of CTD-ILD patients.
Risk prediction in CTD-ILD remains challenging because of heterogeneity in patient-specific and disease-specific variables. The ILD-gender-age-physiology (ILD-GAP) model is a multidimensional mortality risk prediction model composed of the ILD diagnosis, sex, age, the percent predicted forced vital capacity (FVC %Predicted) and the percent predicted diffusing capacity of the lung for carbon monoxide (DLco %Predicted). Since it was first established by Christopher J. Ryerson et al. in a North American population, the ILD-GAP model has been widely used to predict mortality across all chronic ILD subtypes, including CTD-ILD [5]. However, the ILD-GAP model has not been validated in Chinese CTD-ILD patients. Therefore, more inclusive studies are needed to validate the existing assessment model and improve its prediction accuracy.
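To make the structure of the ILD-GAP index concrete, the following Python sketch sums points over the five components named above. It is an illustration only: the point values and cut-offs are assumed from the published ILD-GAP index (Ryerson et al., 2014) and should be verified against that source before any use, and the function name, argument names and subtype labels are hypothetical.

```python
# Illustrative sketch of ILD-GAP scoring. Point values and cut-offs are
# ASSUMED from the published index and must be checked against the source.
SUBTYPE_POINTS = {
    "IPF": 0,
    "unclassifiable ILD": 0,
    "CTD-ILD": -2,
    "idiopathic NSIP": -2,
    "chronic HP": -2,
}


def ild_gap_score(subtype, male, age, fvc_pct, dlco_pct):
    """Return the ILD-GAP point total.

    dlco_pct=None means the patient could not perform the DLco manoeuvre.
    """
    score = SUBTYPE_POINTS[subtype]          # ILD diagnosis
    score += 1 if male else 0                # gender
    if age > 65:                             # age in years
        score += 2
    elif age > 60:
        score += 1
    if fvc_pct < 50:                         # FVC %Predicted
        score += 2
    elif fvc_pct <= 75:
        score += 1
    if dlco_pct is None:                     # DLco %Predicted
        score += 3
    elif dlco_pct <= 35:
        score += 2
    elif dlco_pct <= 55:
        score += 1
    return score


if __name__ == "__main__":
    # Hypothetical patient: 67-year-old man with CTD-ILD
    print(ild_gap_score("CTD-ILD", male=True, age=67, fvc_pct=60, dlco_pct=30))
```

Because every CTD-ILD patient starts from the same subtype offset, differences in score within a CTD-ILD cohort come only from sex, age and the two physiological measures.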
We performed this study to establish a comprehensive predictive nomogram using machine learning algorithms, incorporating demographic characteristics, clinical features, echocardiography, laboratory tests and imaging examinations. Furthermore, we examined whether combining the nomogram with the ILD-GAP model could yield superior prognostic performance.
CTD-ILD patients who were diagnosed and treated at the First Affiliated Hospital of Zhengzhou University between February 2011 and July 2018 were enrolled according to predefined criteria. Patients were included if they met all four of the following inclusion criteria: (1) a diagnosis of CTD-ILD based on the criteria recommended by the American Rheumatism Association and the American College of Rheumatology [6,7,8,9,10,11,12], including polymyositis/dermatomyositis (PM/DM), systemic lupus erythematosus (SLE), systemic sclerosis (SSc), ankylosing spondylitis (AS), Sjögren syndrome (SS), mixed connective tissue disease (MCTD), rheumatoid arthritis (RA), undifferentiated connective tissue disease (UCTD) and overlap syndromes (OCTD), with UCTD patients also required to fulfil the diagnostic criteria for UCTD-ILD established in previous research [13]; (2) clinical symptoms (dyspnea or cough); (3) signs suggestive of ILD (end-inspiratory bibasilar crepitations); (4) radiographic signs of ILD (honeycombing, ground-glass opacities, nodular or reticulonodular opacities) confirmed by high-resolution computed tomography (HRCT). Patients were excluded if they met any one of the following exclusion criteria: (1) age younger than 18 years; (2) pregnancy; (3) loss to follow-up; (4) incomplete clinical records. This study was approved by the Institutional Review Board of the First Affiliated Hospital of Zhengzhou University (2019-KY-116).
Data were extracted by medical chart review, including age, sex, occupation, smoking history, days of symptoms, medication treatment history, chronic disease history (diabetes and hypertension), CTD types, pulmonary function tests (PFTs), echocardiography, laboratory data (routine inflammatory, hematological and biochemical parameters) and chest HRCT.
The collected PFT data included FVC %Predicted, the percent predicted forced expiratory volume in one second (FEV1 %Predicted), FEV1/FVC and DLco %Predicted.
The collected echocardiography data included right ventricular diameter (RVD), right atrial area (RAA), left ventricular diameter, aortic annulus diameter, left atrial diameter, ascending aortic diameter, pulmonary artery diameter, pulmonary artery systolic pressure (PASP), aortic valve regurgitation peak velocity, tricuspid regurgitant peak velocity and left ventricular ejection fraction (LVEF).
The collected laboratory data included Krebs von den Lungen-6, procalcitonin, complement component C4, complement component C3, C-reactive protein (CRP), erythrocyte sedimentation rate, leukocyte count, platelet count, hemoglobin, erythrocyte count, hematocrit, blood urea nitrogen, B-type natriuretic peptide (BNP), uric acid, creatinine, fasting blood glucose, aspartate transaminase (AST), alanine aminotransferase, γ-glutamyltranspeptidase (GGT), alkaline phosphatase, total protein, albumin (ALB), globulin, triglyceride, prothrombin time, cholesterol, activated partial thromboplastin time, prothrombin time activity, thrombin time, international normalized ratio, fibrinogen and D-dimer.
HRCT images were reviewed independently by 2 expert thoracic radiologists who were blinded to the patients’ diagnoses. When their readings diverged, the images were re-evaluated until a consensus was reached. The collected HRCT characteristics included honeycombing, ground-glass opacities, nodules, fine reticular opacities, local pleural thickening, pulmonary bullae, hydrothorax and hydropericardium.
Follow‑up and study outcome
All-cause mortality was the endpoint, with follow-up until July 2021. Follow-up was performed by contacting patients or their family members by mobile phone.
Analyses were performed with the R programming language (R Core Team, 2021; version 4.1.0). Normally distributed continuous variables are presented as mean ± standard deviation (SD), and non-normally distributed variables as median (interquartile range, IQR). Student’s t-test was used to compare normally distributed variables, and the Wilcoxon signed-rank test was used to compare non-normally distributed variables. The Chi-square test and Fisher’s exact test were used to compare categorical data. First, missing covariates were imputed by multiple imputation by chained equations using the “mice” package in R. Second, the least absolute shrinkage and selection operator (LASSO) was applied with the “glmnet” package to avoid overfitting, and lambda (λ) was tuned by tenfold cross-validation (CV) using the “cv.glmnet” function from the same package. Then, Cox regression analysis was used to assess the significance of the remaining predictors of mortality with the “coxph” function in the “survival” package, and the prognostic nomogram was built from the multivariable Cox regression coefficients with the “rms” package. Finally, the calibration plot for internal validation was generated by a bootstrap method with 1000 resamples using the “rms” package, specifying the parameters “method = “boot”, B = 1000”, from the training set (n = 1000). The predictive performance of the nomogram and the ILD-GAP model was compared using Harrell’s C index (“survival” package), likelihood ratio testing (“lrtest” function in the “lmtest” package), a continuous version of the net reclassification improvement (NRI) and the integrated discrimination improvement (IDI) (“survC1” and “survIDINRI” packages). Additionally, decision curve analysis (DCA) was performed using the source file “stdca.R”. P values less than 0.050 were considered statistically significant.
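Harrell’s C index, used above to compare the two models, can be illustrated with a short Python sketch. This is not the authors’ code (the analysis was done in R); it is a simplified version that assumes right-censored data, skips pairs with tied follow-up times, and uses hypothetical function and variable names.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C for right-censored survival data.

    times       : follow-up time per patient
    events      : 1 if death was observed, 0 if censored
    risk_scores : model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        # a is the patient with the shorter follow-up time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier death had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Hypothetical toy data: months of follow-up, death indicator, risk score
    times = [5, 10, 15, 20]
    events = [1, 1, 0, 1]
    risks = [0.9, 0.7, 0.4, 0.2]
    print(harrell_c_index(times, events, risks))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which is the scale on which the reported values of 0.767 and 0.652 should be read.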
The patient screening process is illustrated in Fig. 1. After excluding patients who were younger than 18 years (n = 4), were pregnant (n = 2), had substantial missing data (n = 5) or were lost to follow-up (n = 43), a total of 675 patients entered the study. There were no significant differences between the enrolled patients and those lost to follow-up in age, gender, occupation, smoking history, days of symptoms, medication treatment history, chronic disease history, pulmonary function tests (PFTs) or HRCT findings (all P > 0.050). Therefore, excluding the patients lost to follow-up is unlikely to have affected the overall results of our study (Table 1).
In this study, the mean age of the patients was 54 ± 12 years (23.7% were male and 11.1% were ever smokers), and the median follow-up period was 50 months (interquartile range, 38–65). The disease subtypes were polymyositis/dermatomyositis (29.8%), systemic lupus erythematosus (8.3%), systemic sclerosis (13.3%), ankylosing spondylitis (0.1%), Sjögren syndrome (14.4%), mixed connective tissue disease (5.9%), rheumatoid arthritis (RA) (7.4%), undifferentiated connective tissue disease (15.3%) and overlap syndromes (5.5%). A total of 158 patients died during the follow-up period; the 3- and 5-year mortality rates were 17.1% (95% confidence interval (CI) 14.2–19.8%) and 24.5% (95% CI 20.9–28.0%), respectively (Fig. 2). Compared with survivors, deceased patients were significantly more likely to be older, male, ever smokers and farmers, and to have been treated without immunosuppressive drugs (all P < 0.050). Patients with RA had the highest mortality among the CTD subtypes (P < 0.001). Deceased patients were also more likely to have a lower percent predicted diffusing capacity of the lung for carbon monoxide (DLco %Predicted) and left ventricular ejection fraction (LVEF), a larger right atrial area (RAA) and a higher pulmonary artery systolic pressure (PASP) (all P < 0.050). In addition, honeycombing, fine reticular opacities, local pleural thickening, pulmonary bullae, hydrothorax and hydropericardium on chest HRCT were each associated with a higher risk of death (all P < 0.050) (Table 1).
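Time-specific mortality with censored follow-up, such as the 3- and 5-year rates above, is commonly estimated as one minus the Kaplan–Meier survival estimate. The text does not state which estimator was used, so the Python sketch below is an assumption offered for illustration, with hypothetical function names and toy data.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each time with at least one death.

    times  : follow-up time per patient
    events : 1 if death was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


def mortality_at(curve, horizon):
    """Cumulative mortality (1 - survival) at the given time horizon."""
    survival = 1.0
    for t, s in curve:
        if t > horizon:
            break
        survival = s
    return 1.0 - survival


if __name__ == "__main__":
    # Hypothetical toy data: months of follow-up and death indicator
    curve = kaplan_meier([12, 24, 36, 48, 60], [1, 0, 1, 0, 1])
    print(mortality_at(curve, 36), mortality_at(curve, 60))
```

Unlike the crude proportion of deaths (158 of 675, or 23.4%), this estimator accounts for patients whose follow-up ended before the time horizon.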
A total of 74 candidate prognostic indicators were included in this study. First, we reduced the dimensionality and selected the most informative prognostic indicators using LASSO-penalized Cox regression. Tenfold cross-validation of the LASSO model was then performed to select the tuning parameter by the minimum criterion (Fig. 3A). The trajectory of each indicator’s coefficient as the log-transformed lambda changed is shown in the LASSO coefficient profiles (Fig. 3B).
Finally, the optimal lambda value selected by the LASSO algorithm was 0.052 (log(lambda) was − 2.950), and 14 variables were selected as potential prognosis-related indicators: age, RA, DLco %Predicted, right ventricular diameter (RVD), RAA, PASP, LVEF, honeycombing, C-reactive protein (CRP), B-type natriuretic peptide (BNP), aspartate transaminase (AST), γ-glutamyltranspeptidase (GGT), albumin (ALB) and immunosuppressive agents. Univariable analysis showed that increased age (hazard ratio (HR) 1.041, 95% CI 1.027–1.055), RVD (HR 1.027, 95% CI 1.017–1.038), RAA (HR 1.122, 95% CI 1.093–1.151), PASP (HR 1.025, 95% CI 1.017–1.034), CRP (HR 1.005, 95% CI 1.002–1.008), BNP (HR 1.000, 95% CI 1.000–1.000), AST (HR 1.003, 95% CI 1.002–1.1005) and GGT (HR 1.002, 95% CI 1.001–1.1003), and lower DLco %Predicted (HR 0.982, 95% CI 0.975–0.990), LVEF (HR 0.949, 95% CI 0.926–0.973) and ALB levels (HR 0.936, 95% CI 0.914–0.959), correlated with increased mortality (all P < 0.001). Patients with RA (HR 2.292, 95% CI 1.539–3.413, P < 0.001) and honeycombing (HR 2.167, 95% CI 1.392–3.373, P = 0.001) also had higher mortality. In addition, mortality was lower in patients receiving immunosuppressive therapy (HR 0.506, 95% CI 0.367–0.697, P < 0.001) (Table 2). Variables that were significant in the univariable analysis (P < 0.050) were entered into a multivariable Cox model, which showed that age, RA, DLco %Predicted, RVD, RAA, honeycombing, immunosuppressive agents, AST and ALB significantly affected overall mortality (all P < 0.050) (Table 2). Based on the multivariable Cox regression analysis, these 9 independent variables were included in the nomogram for prognostic assessment (Fig. 4).
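Hazard ratios close to 1 can look negligible because they describe a one-unit change in the predictor. Under the log-linear assumption of the Cox model, the hazard ratio for a larger change is the per-unit value raised to the size of that change. The Python sketch below illustrates this. It assumes the reported values are per-unit hazard ratios (the units, such as age in years, are an assumption not stated in this passage), and the function names are hypothetical.

```python
import math


def log_hazard_coefficient(hazard_ratio):
    """Cox regression coefficient (beta) implied by a per-unit hazard ratio."""
    return math.log(hazard_ratio)


def hazard_ratio_for_increment(hazard_ratio_per_unit, increment):
    """Hazard ratio for a change of `increment` units, assuming log-linearity."""
    return math.exp(log_hazard_coefficient(hazard_ratio_per_unit) * increment)


if __name__ == "__main__":
    # Univariable HR for age reported above: 1.041 per unit (assumed per year)
    print(round(hazard_ratio_for_increment(1.041, 10), 3))
    # Univariable HR for ALB reported above: 0.936 per unit; effect of a 5-unit drop
    print(round(hazard_ratio_for_increment(0.936, -5), 3))
```

Rescaling in this way does not change the model; it only expresses the same coefficient over an interval that is easier to interpret clinically.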
In univariable Cox regression, higher ILD-GAP scores were associated with higher mortality (HR 1.413, 95% CI 1.285–1.554, P < 0.001; Table 2). However, the ILD-GAP model did not perform well in predicting mortality (Harrell’s C index 0.652), and calibration plots showed that it overestimated 3- and 5-year survival (Fig. 5A, B).
The nomogram showed better prognostic performance (Harrell’s C index 0.767) than the ILD-GAP model. The likelihood ratio test indicated a statistically significant improvement when the nomogram was added to the ILD-GAP model (P < 0.001), but no significant improvement when the ILD-GAP model was added to the nomogram (P = 0.455) (Table 3). Calibration plots for nomogram-predicted 3- and 5-year overall survival showed good agreement with actual observations (Fig. 5C, D). The nomogram also improved discrimination of 3-year (IDI 0.137 and NRI 0.294, both P < 0.001) and 5-year (IDI 0.136 and NRI 0.325, both P < 0.001) mortality compared with the ILD-GAP model (Table 4). To assess the clinical utility of both models, we performed decision curve analysis. At decision thresholds > 0%, the nomogram showed a greater net benefit than the ILD-GAP model for clinical intervention (Fig. 6A, B). In internal validation, the average Harrell’s C index for the prediction models developed in the bootstrap samples was 0.876, and the estimate of optimism was − 0.108.
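The net benefit plotted in a decision curve weighs true positives against false positives at a chosen risk threshold. The Python sketch below shows the standard formula for a binary outcome at a fixed time horizon. It is a simplification: the authors used “stdca.R”, which additionally handles censored survival data, and the function names and counts here are hypothetical.

```python
def net_benefit(true_positives, false_positives, n, threshold):
    """Net benefit of treating patients whose predicted risk exceeds `threshold`.

    net benefit = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    odds = threshold / (1.0 - threshold)
    return true_positives / n - (false_positives / n) * odds


def net_benefit_treat_all(event_rate, threshold):
    """Reference strategy in which every patient is treated."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    odds = threshold / (1.0 - threshold)
    return event_rate - (1.0 - event_rate) * odds


if __name__ == "__main__":
    # Hypothetical counts: 100 patients, 30 true and 20 false positives at a 20% threshold
    print(net_benefit(30, 20, 100, 0.20))
    print(net_benefit_treat_all(0.40, 0.20))
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.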
The ILD-GAP model was derived and validated in a Western cohort but has not been validated in a Chinese population to date, and its ability to accurately define disease stage remains debated [14,15,16]. To reduce potential racial bias arising from the ILD-GAP model, we developed a nomogram for predicting 3‑ and 5-year mortality in Chinese CTD-ILD patients using a machine learning approach and tested whether combining the nomogram with the ILD-GAP model could yield superior prognostic performance.
Multivariable analysis demonstrated that older age, RA, honeycombing, lower DLco %Predicted and ALB, and increased RVD, RAA and AST were associated with higher mortality, whereas immunosuppressive therapy was associated with reduced mortality. These independent risk factors are supported by previous studies and theory. Age has been shown to be an independent predictor of mortality in CTD-ILD in a previous study, because older patients generally have more comorbidities and worse health status. Among ILD patterns, usual interstitial pneumonia (UIP) on chest HRCT is associated with a poorer response to corticosteroids and a worse prognosis than other subtypes [17, 18]. Honeycombing occurs in up to 90% of UIP cases and is the most specific finding of UIP on chest HRCT [19]. Honeycombing is therefore correlated with the prognosis of CTD-ILD patients to some extent. Gas exchange impairment is a common pathophysiological change in the early stage of ILD and typically presents as a reduction in DLco [20]. Qiang Fu et al. reported that a DLco %Predicted < 45% is a risk factor for CTD-ILD prognosis [21]. Elevated serum AST and abnormal ALB can be caused by impaired heart, liver and kidney function due to CTD-ILD [22]. Long-term monitoring of serum AST and ALB may provide an early warning signal before organ dysfunction occurs. Furthermore, abnormally increased AST and hypoalbuminemia have been shown to increase mortality in CTD-ILD patients [23,24,25]. Long-term hypoxia caused by gas exchange impairment may lead to increases in pulmonary artery pressure and right ventricular afterload [26]. Right heart enlargement due to persistently increased afterload, characterized by increases in RVD and RAA, is a common cause of mortality in patients with ILD [27]. In addition, glucocorticoid and immunosuppressive therapy are essential options for CTD-ILD patients, and mortality can be reduced by the appropriate use of immunosuppressive agents [2, 28,29,30].
ILD can complicate RA and is associated with excess mortality [31]. Research has shown that nearly 10% of deaths in RA patients are attributable to ILD. RA patients are more likely to die of ILD than other CTD patients [2, 32].
We developed a nomogram from the independent mortality risk factors identified in the multivariable analysis. The association between predictor variables and time-to-event outcomes was assessed by the LASSO-Cox method. LASSO is a machine learning algorithm that uses regularization to improve estimation accuracy: it adds an L1 penalty term to the loss function, which shrinks coefficients towards zero and sets some of them exactly to zero. The LASSO-Cox method has recently become popular among researchers because it can reduce overfitting and select predictors for a nomogram [33].
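For reference, the L1-penalized Cox estimator described above can be written as follows. The notation is introduced here for illustration and is not taken from the paper: $x_i$ is the covariate vector of patient $i$, $\delta_i$ the death indicator, $R(t_i)$ the set of patients still at risk at time $t_i$, $p$ the number of candidate predictors, and $\lambda$ the tuning parameter chosen by cross-validation. Software implementations may scale the likelihood term differently.

```latex
\hat{\beta} \;=\; \arg\max_{\beta}
\left[
  \sum_{i:\,\delta_i = 1}
  \left(
    x_i^{\top}\beta
    \;-\;
    \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right)
  \right)
  \;-\;
  \lambda \sum_{k=1}^{p} \lvert \beta_k \rvert
\right]
```

The first term is the Cox partial log-likelihood and the second is the L1 penalty. Larger values of $\lambda$ force more coefficients to zero, which is how the 74 candidate indicators were reduced to a smaller set of predictors.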
In our cohort, the nomogram for Chinese CTD-ILD patients showed better discriminative ability, calibration and clinical net benefit than the ILD-GAP model. Although combining the nomogram with the ILD-GAP model improved prognostic performance compared with the ILD-GAP model alone, it did not improve performance compared with the nomogram alone. Specifically, Harrell’s C index and the calibration curves of the nomogram showed good concordance between predicted and actual mortality risk. The integrated discrimination improvement and net reclassification improvement confirmed that the nomogram discriminated mortality better than the ILD-GAP model. At decision thresholds > 0%, the nomogram showed a higher net benefit than the ILD-GAP model for clinical intervention in decision curve analysis. Two reasons might explain why the ILD-GAP model is inferior to the nomogram in predicting the prognosis of Chinese CTD-ILD patients. First, the ILD-GAP model was derived and validated in a Western cohort that included no Chinese population, so the risk of bias arising from ethnic differences should be considered. Second, the ILD-GAP model was derived from the GAP risk prediction model, which was developed specifically for prognosis prediction in idiopathic pulmonary fibrosis (IPF), and the median survival time in IPF is much shorter than in CTD-ILD [5, 34]. The ILD-GAP model can still provide important value for the treatment of CTD-ILD patients, but a more complex model seems necessary to achieve better predictions [35,36,37]. The clinical indicators included in this nomogram are routine data that are easily acquired in most hospitals, which makes the nomogram applicable to daily clinical use. We believe that the nomogram could be widely used as a clinical reference after further validation and refinement in cross-sectional and longitudinal studies.
Our study has some limitations. First, the nomogram was not externally validated, so caution is advised when applying it in clinical practice. To the best of our knowledge, this is the first model developed to predict all-cause mortality in the Chinese population with CTD-ILD, and we believe that an early report is needed to provide a basis for future studies. Second, disease categories rather than serologic autoantibodies were included as predictors in the nomogram because of the risk of collinearity. Third, the median survival time for CTD-ILD patients has been reported to be around 6.5 years, whereas the median follow-up period in our cohort was 50 (interquartile range, 38–65) months. However, our study had a larger sample size and a longer follow-up period than most previous studies. Fourth, the nomogram and the ILD-GAP model were built on baseline characteristics, and longitudinal disease activity was not considered. Omitting risk-associated disease trajectories is therefore likely to have led both models to underestimate the true relationship between CTD-ILD and mortality.
In conclusion, the ILD-GAP model performed poorly in predicting mortality in Chinese patients with CTD-ILD. Using a machine learning approach, we developed a nomogram for predicting 3‑ and 5-year mortality in Chinese CTD-ILD patients, and it performed well in predicting mortality risk.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Abbreviations
CTD: Connective tissue disease
ILD: Interstitial lung disease
CTD-ILD: Connective tissue disease-associated interstitial lung disease
PFT: Pulmonary function test
SLE: Systemic lupus erythematosus
MCTD: Mixed connective tissue disease
UCTD: Undifferentiated connective tissue disease
HRCT: High-resolution computed tomography
FVC %Predicted: Percent predicted values of forced vital capacity
FEV1 %Predicted: Percent predicted values of forced expiratory volume in one second
DLco %Predicted: Percent predicted values of diffusion capacity of lung for carbon monoxide
RVD: Right ventricular diameter
RAA: Right atrial area
PASP: Pulmonary artery systolic pressure
LVEF: Left ventricular ejection fraction
BNP: B-type natriuretic peptide
LASSO: Least absolute shrinkage and selection operator
NRI: Net reclassification improvement
IDI: Integrated discrimination improvement
DCA: Decision curve analysis
1. Spagnolo P, Distler O, Ryerson CJ, Tzouvelekis A, Lee JS, Bonella F, et al. Mechanisms of progressive fibrosis in connective tissue disease (CTD)-associated interstitial lung diseases (ILDs). Ann Rheum Dis. 2021;80(2):143–50.
2. Mathai SC, Danoff SK. Management of interstitial lung disease associated with connective tissue disease. BMJ (Clinical research ed). 2016;352:h6819.
3. Suzuki A, Kondoh Y, Fischer A. Recent advances in connective tissue disease related interstitial lung disease. Expert Rev Respir Med. 2017;11(7):591–603.
4. Demoruelle MK, Mittoo S, Solomon JJ. Connective tissue disease-related interstitial lung disease. Best Pract Res Clin Rheumatol. 2016;30(1):39–52.
5. Ryerson CJ, Vittinghoff E, Ley B, Lee JS, Mooney JJ, Jones KD, et al. Predicting survival across chronic interstitial lung disease: the ILD-GAP model. Chest. 2014;145(4):723–8.
6. McVeigh CM, Cairns AP. Diagnosis and management of ankylosing spondylitis. BMJ (Clinical research ed). 2006;333(7568):581–5.
7. Sharp GC, Irvin WS, Tan EM, Gould RG, Holman HR. Mixed connective tissue disease–an apparently distinct rheumatic disease syndrome associated with a specific antibody to an extractable nuclear antigen (ENA). Am J Med. 1972;52(2):148–59.
8. Vitali C, Bombardieri S, Moutsopoulos HM, Coll J, Gerli R, Hatron PY, et al. Assessment of the European classification criteria for Sjogren’s syndrome in a series of clinically defined cases: results of a prospective multicentre study. The European Study Group on Diagnostic Criteria for Sjogren’s Syndrome. Ann Rheum Dis. 1996;55(2):116–21.
9. Wolf L, Sheahan M, McCormick J, Michel B, Moskowitz RW. Classification criteria for systemic lupus erythematosus. Frequency in normal patients. JAMA. 1976;236(13):1497–9.
10. Bohan A, Peter JB. Polymyositis and dermatomyositis (first of two parts). N Engl J Med. 1975;292(7):344–7.
11. Preliminary criteria for the classification of systemic sclerosis (scleroderma). Subcommittee for scleroderma criteria of the American Rheumatism Association Diagnostic and Therapeutic Criteria Committee. Arthritis Rheum. 1980;23(5):581–90.
12. Aletaha D, Neogi T, Silman AJ, Funovits J, Felson DT, Bingham CO 3rd, et al. 2010 Rheumatoid arthritis classification criteria: an American College of Rheumatology/European League Against Rheumatism collaborative initiative. Arthritis Rheum. 2010;62(9):2569–81.
13. Hu Y, Wang LS, Wei YR, Du SS, Du YK, He X, et al. Clinical characteristics of connective tissue disease-associated interstitial lung disease in 1,044 Chinese patients. Chest. 2016;149(1):201–8.
14. Brusca RM, Pinal-Fernandez I, Psoter K, Paik JJ, Albayda J, Mecoli C, et al. The ILD-GAP risk prediction model performs poorly in myositis-associated interstitial lung disease. Respir Med. 2019;150:63–5.
15. Mango RL, Matteson EL, Crowson CS, Ryu JH, Makol A. Assessing mortality models in systemic sclerosis-related interstitial lung disease. Lung. 2018;196(4):409–16.
16. Kam MLW, Li HH, Tan YH, Low SY. Validation of the ILD-GAP model and a local nomogram in a singaporean cohort. Respiration. 2019;98(5):383–90.
17. Yunt ZX, Chung JH, Hobbs S, Fernandez-Perez ER, Olson AL, Huie TJ, et al. High resolution computed tomography pattern of usual interstitial pneumonia in rheumatoid arthritis-associated interstitial lung disease: relationship to survival. Respir Med. 2017;126:100–4.
18. Kim EJ, Elicker BM, Maldonado F, Webb WR, Ryu JH, Van Uden JH, et al. Usual interstitial pneumonia in rheumatoid arthritis-associated interstitial lung disease. Eur Respir J. 2010;35(6):1322–8.
19. Chung JH, Chawla A, Peljto AL, Cool CD, Groshong SD, Talbert JL, et al. CT scan findings of probable usual interstitial pneumonitis have a high predictive value for histologic usual interstitial pneumonitis. Chest. 2015;147(2):450–9.
20. Kelly CA, Saravanan V, Nisar M, Arthanari S, Woodhead FA, Price-Forbes AN, et al. Rheumatoid arthritis-related interstitial lung disease: associations, prognostic factors and physiological and radiological characteristics–a large multicentre UK study. Rheumatology (Oxford). 2014;53(9):1676–82.
21. Fu Q, Wang L, Li L, Li Y, Liu R, Zheng Y. Risk factors for progression and prognosis of rheumatoid arthritis-associated interstitial lung disease: single center study with a large sample of Chinese population. Clin Rheumatol. 2019;38(4):1109–16.
22. Hull RP, Goldsmith DJ. Nephrotic syndrome in adults. BMJ (Clinical research ed). 2008;336(7654):1185–9.
23. Lawrence YA, Steiner JM. Laboratory evaluation of the liver. Vet Clin North Am Small Anim Pract. 2017;47(3):539–53.
24. Li R, Zhu WJ, Wang F, Tang X, Luo F. AST/ALT ratio as a predictor of mortality and exacerbations of PM/DM-ILD in 1 year-a retrospective cohort study with 522 cases. Arthritis Res Ther. 2020;22(1):202.
25. Akirov A, Masri-Iraqi H, Atamna A, Shimon I. Low albumin levels are associated with mortality risk in hospitalized patients. Am J Med. 2017;130(12):1465.
26. Grimminger J, Ghofrani HA, Weissmann N, Klose H, Grimminger F. COPD-associated pulmonary hypertension: clinical implications and current methods for treatment. Expert Rev Respir Med. 2016;10(7):755–66.
27. Wang Z, Chesler NC. Pulmonary vascular mechanics: important contributors to the increased right ventricular afterload of pulmonary hypertension. Exp Physiol. 2013;98(8):1267–73.
28. Castelino FV, Varga J. Interstitial lung disease in connective tissue diseases: evolving concepts of pathogenesis and management. Arthritis Res Ther. 2010;12(4):213.
29. Vij R, Strek ME. Diagnosis and treatment of connective tissue disease-associated interstitial lung disease. Chest. 2013;143(3):814–24.
30. Witt LJ, Demchuk C, Curran JJ, Strek ME. Benefit of adjunctive tacrolimus in connective tissue disease-interstitial lung disease. Pulm Pharmacol Ther. 2016;36:46–52.
31. Young A, Koduri G, Batley M, Kulinskaya E, Gough A, Norton S, et al. Mortality in rheumatoid arthritis. Increased in the early course of disease, in ischaemic heart disease and in pulmonary fibrosis. Rheumatology. 2007;46(2):350–7.
32. Olson AL, Swigris JJ, Sprunger DB, Fischer A, Fernandez-Perez ER, Solomon J, et al. Rheumatoid arthritis-interstitial lung disease-associated mortality. Am J Respir Crit Care Med. 2011;183(3):372–8.
33. Tibshirani R. The lasso method for variable selection in the Cox model. Stat Med. 1997;16(4):385–95.
34. Ley B, Ryerson CJ, Vittinghoff E, Ryu JH, Tomassetti S, Lee JS, et al. A multidimensional index and staging system for idiopathic pulmonary fibrosis. Ann Intern Med. 2012;156(10):684–91.
35. Jones PW, Quirk FH, Baveystock CM. The St George’s respiratory questionnaire. Respir Med. 1991;85(Suppl B):25–31.
36. Schurink CAM, Nieuwenhoven CAV, Jacobs JA, Rozenberg-Arska M, Joore HCA, Buskens E, et al. Clinical pulmonary infection score for ventilator-associated pneumonia: accuracy and inter-observer variability. Intensive Care Med. 2004;30(2):217–24.
37. Valencia M, Badia JR, Cavalcanti M, Ferrer M, Agusti C, Angrill J, et al. Pneumonia severity index class v patients with community-acquired pneumonia: characteristics, outcomes, and value of severity scores. Chest. 2007;132(2):515–22.
This study was supported by the National Natural Science Foundation of China (U1904142, 82000015), the Scientific and Technological Projects of the Science and Technology Department of Henan Province (182102410010) and the Key Scientific Research Project of Colleges and Universities in Henan Province (18A320056).
Ethics approval and consent to participate
Ethical approval for this study was obtained from the Institutional Review Board of the First Affiliated Hospital of Zhengzhou University (2019-KY-116).
Competing interests
The authors declare that they have no competing interests.
Cite this article
Sun, D., Wang, Y., Liu, Q. et al. Prediction of long-term mortality by using machine learning models in Chinese patients with connective tissue disease-associated interstitial lung disease. Respir Res 23, 4 (2022). https://doi.org/10.1186/s12931-022-01925-x
- Interstitial lung disease
- Connective tissue disease
- Machine learning