Prediction of long-term mortality by using machine learning models in Chinese patients with connective tissue disease-associated interstitial lung disease

Sun, Di; Wang, Yu; Liu, Qing; Wang, Tingting; Li, Pengfei; Jiang, Tianci; Dai, Lingling; Jia, Liuqun; Zhao, Wenjing; Cheng, Zhe

doi:10.1186/s12931-022-01925-x

Research
Open access
Published: 07 January 2022

Zhe Cheng ORCID: orcid.org/0000-0002-0894-2444¹

Respiratory Research volume 23, Article number: 4 (2022)

Abstract

Background

Accurate risk assessment is crucial for the management of patients with connective tissue disease-associated interstitial lung disease (CTD-ILD). In the present study, we developed a nomogram to predict 3- and 5-year mortality using a machine learning approach and tested the ILD-GAP model in Chinese CTD-ILD patients.

Methods

CTD-ILD patients who were diagnosed and treated at the First Affiliated Hospital of Zhengzhou University between February 2011 and July 2018 were enrolled according to predefined criteria. Cox regression with the least absolute shrinkage and selection operator (LASSO) was used to select predictors and generate a nomogram. Internal validation was performed using bootstrap resampling. The nomogram and the ILD-GAP model were then assessed via likelihood ratio testing, Harrell’s C index, integrated discrimination improvement (IDI), net reclassification improvement (NRI) and decision curve analysis.

Results

A total of 675 consecutive CTD-ILD patients were enrolled in this study. During a median follow-up period of 50 (interquartile range, 38–65) months, 158 patients died (mortality rate 23.4%). After feature selection, 9 variables were identified: age, rheumatoid arthritis, lung diffusing capacity for carbon monoxide, right ventricular diameter, right atrial area, honeycombing, immunosuppressive agents, aspartate transaminase and albumin. A predictive nomogram integrating these variables provided better mortality estimates than the ILD-GAP model based on likelihood ratio testing, Harrell’s C index (0.767 and 0.652, respectively) and calibration plots. Application of the nomogram resulted in improved IDI (3- and 5-year, 0.137 and 0.136, respectively) and NRI (3- and 5-year, 0.294 and 0.325, respectively) compared with the ILD-GAP model. In addition, decision curve analysis showed that the nomogram was more clinically useful.

Conclusions

Our results suggest that the ILD-GAP model may not be applicable for predicting mortality risk in Chinese CTD-ILD patients. The nomogram we developed performed well in predicting the 3- and 5-year mortality risk of Chinese CTD-ILD patients, but further studies and external validation are required to determine its clinical usefulness.

Background

Connective tissue disease (CTD), which involves multiple autoimmune mechanisms, is characterized by self-directed inflammation that often leads to collagen deposition, tissue damage and ultimately target organ failure [1]. CTD can involve multiple organs and systems, and interstitial lung disease (ILD) remains a main cause of morbidity and mortality [2]. The median survival time for patients with CTD-associated ILD (CTD-ILD) has been reported to be around 6.5 years, and up to 12.4% of patients with CTD-ILD die of ILD [3, 4]. Accurate risk assessment is therefore crucial for the management of CTD-ILD patients.

Risk prediction in CTD-ILD remains challenging because of the heterogeneity in patient-specific and disease-specific variables. The ILD-gender-age-physiology (ILD-GAP) model is a multidimensional mortality risk prediction model composed of the ILD diagnosis, sex, age, the percent predicted value of forced vital capacity (FVC %Predicted) and the percent predicted value of diffusion capacity of the lung for carbon monoxide (DLco %Predicted). Since the ILD-GAP model was first established by Christopher J. Ryerson et al. in a North American population, it has been widely used to predict mortality across all chronic ILD subtypes, including CTD-ILD [5]. However, the ILD-GAP model has not been validated in Chinese CTD-ILD patients. Therefore, more inclusive studies are needed to validate the existing assessment model and improve its prediction accuracy.

We performed this study to establish a comprehensive predictive nomogram using machine learning algorithms, incorporating demographic characteristics, clinical features, echocardiography, laboratory testing and imaging examination. Furthermore, we evaluated whether combining the nomogram with the ILD-GAP model could yield superior prognostic performance.

Methods

Patients

CTD-ILD patients who were diagnosed and treated at the First Affiliated Hospital of Zhengzhou University between February 2011 and July 2018 were enrolled according to predefined criteria. Patients were included if they met all four of the following inclusion criteria: (1) a diagnosis of CTD-ILD according to the criteria recommended by the American Rheumatism Association and the American College of Rheumatology [6,7,8,9,10,11,12], including polymyositis/dermatomyositis (PM/DM), systemic lupus erythematosus (SLE), systemic sclerosis (SSc), ankylosing spondylitis (AS), Sjogren syndrome (SS), mixed connective tissue disease (MCTD), rheumatoid arthritis (RA), undifferentiated connective tissue disease (UCTD) and overlap syndromes (OCTD), with UCTD patients also required to meet the diagnostic criteria for UCTD-ILD established by previous research [13]; (2) clinical symptoms (dyspnea or cough); (3) signs suggestive of ILD (end-inspiratory bibasilar crepitations); (4) radiographic signs of ILD (honeycombing, ground-glass opacities, nodular or reticulonodular opacities) confirmed by high-resolution computed tomography (HRCT). Patients were excluded if they met any one of the following exclusion criteria: (1) age younger than 18 years; (2) pregnancy; (3) loss to follow-up; (4) incomplete clinical records. This study was approved by the Institutional Review Board of the First Affiliated Hospital of Zhengzhou University (2019-KY-116).

Data collection

Data were extracted by medical chart review and included age, sex, occupation, smoking history, days of symptoms, medication treatment history, chronic disease history (diabetes and hypertension), CTD types, pulmonary function tests (PFTs), echocardiography, laboratory data (routine inflammatory, hematological and biochemical parameters) and chest HRCT.

The collected PFT data included FVC %Predicted, the percent predicted value of forced expiratory volume in one second (FEV₁ %Predicted), FEV₁/FVC and DLco %Predicted.

The collected echocardiography data included right ventricular diameter (RVD), right atrial area (RAA), left ventricular diameter, aortic annulus diameter, left atrial diameter, ascending aortic diameter, pulmonary artery diameter, pulmonary artery systolic pressure (PASP), aortic valve regurgitation peak velocity, tricuspid regurgitant peak velocity and left ventricular ejection fraction (LVEF).

The collected laboratory data included Krebs von den Lungen-6, procalcitonin, complement component C4, complement component C3, C-reactive protein (CRP), erythrocyte sedimentation rate, leukocyte count, platelet count, hemoglobin, erythrocyte count, hematocrit, blood urea nitrogen, B-type natriuretic peptide (BNP), uric acid, creatinine, fasting blood glucose, aspartate transaminase (AST), alanine aminotransferase, γ-glutamyltranspeptidase (GGT), alkaline phosphatase, total protein, albumin (ALB), globulin, triglyceride, prothrombin time, cholesterol, activated partial thromboplastin time, prothrombin time activity, thrombin time, international normalized ratio, fibrinogen and D-dimer.

HRCT images were reviewed independently by 2 expert thoracic radiologists who were blinded to the patients’ diagnoses. When their assessments diverged, the images were re-evaluated until a consensus was reached. The collected HRCT characteristics included honeycombing, ground-glass opacities, nodules, fine reticular opacities, local pleural thickening, pulmonary bullae, hydrothorax and hydropericardium.

Follow‑up and study outcome

The endpoint was all-cause mortality during follow-up, which continued until July 2021. Follow-up was performed by contacting patients or their family members by mobile phone.

Statistical analyses

Analyses were performed with the R programming language (R Core Team, online, 2021; version 4.1.0). Normally distributed continuous variables are presented as mean ± standard deviation (SD), and non-normally distributed variables as median (interquartile range, IQR). Student’s t-test was used to compare normally distributed variables, and the Wilcoxon signed-rank test was used to compare non-normally distributed variables. The Chi-square test and Fisher’s exact test were used to compare categorical data. First, multiple imputation by chained equations was used to impute missing covariates with the “mice” package in R. Second, the least absolute shrinkage and selection operator (LASSO) was applied to avoid overfitting using the “glmnet” package, and lambda (λ) was tuned by tenfold cross-validation (CV) using the “cv.glmnet” function from the same package. Then, Cox regression analysis was used to assess the significance of the remaining predictors of mortality using the “coxph” function in the R package “survival”, and the prognostic nomogram was built from the multivariable Cox regression coefficients using the “rms” package. Finally, the calibration plot for internal validation was generated with a bootstrap method with 1000 resamples using the “rms” R package, specifying the parameters method = “boot” and B = 1000, from the training set (n = 1000). The predictive performance of the established nomogram and the ILD-GAP model was compared using Harrell’s C index (“survival” R package), likelihood ratio testing (“lrtest” function in the R package “lmtest”), a continuous version of the net reclassification improvement (NRI) and the integrated discrimination improvement (IDI) (R packages “survC1” and “survIDINRI”). Additionally, decision curve analysis (DCA) was performed using the source file “stdca.R”. P values less than 0.050 were considered statistically significant.
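
As an illustration of the reclassification metrics named above, the following is a minimal Python sketch of the category-free (continuous) NRI and the IDI for a binary outcome at a fixed time horizon. It is not the procedure implemented in the “survIDINRI” package, which accounts for censored survival times; the function names and the toy data are hypothetical, and the sketch assumes that every patient's status at the horizon is known.

```python
from statistics import mean


def continuous_nri(p_old, p_new, event):
    """Category-free NRI for a binary outcome (censoring is NOT handled)."""
    up_e = down_e = up_n = down_n = 0
    n_event = sum(1 for e in event if e)
    n_nonevent = len(event) - n_event
    for old, new, e in zip(p_old, p_new, event):
        if e:
            up_e += new > old      # event moved to a higher predicted risk
            down_e += new < old
        else:
            up_n += new > old
            down_n += new < old    # non-event moved to a lower predicted risk
    return (up_e - down_e) / n_event + (down_n - up_n) / n_nonevent


def idi(p_old, p_new, event):
    """Change in discrimination slope (mean risk in events minus non-events)."""
    def slope(p):
        ev = [x for x, e in zip(p, event) if e]
        ne = [x for x, e in zip(p, event) if not e]
        return mean(ev) - mean(ne)
    return slope(p_new) - slope(p_old)


# Hypothetical predicted risks from an old and a new model for four patients.
event = [1, 1, 0, 0]
p_old = [0.5, 0.4, 0.5, 0.3]
p_new = [0.7, 0.6, 0.2, 0.4]
nri_value = continuous_nri(p_old, p_new, event)
idi_value = idi(p_old, p_new, event)
print(nri_value, idi_value)
```

Positive values of either metric indicate that the new model assigns, on average, higher risks to patients who die and lower risks to those who survive.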

Results

Patient characteristics

The process of patient screening is illustrated in Fig. 1. After excluding patients who were younger than 18 years (n = 4), were pregnant (n = 2), had substantial missing data (n = 5) or were lost to follow-up (n = 43), a total of 675 patients were included in the study. There were no significant differences between the enrolled patients and the patients lost to follow-up in age, gender, occupation, smoking history, days of symptoms, medication treatment history, chronic disease history, pulmonary function tests (PFTs) or HRCT (all P > 0.050). Therefore, excluding the patients lost to follow-up may not have affected the overall results of our study (Table 1).

Table 1 Clinical characteristics of CTD-ILD patients

In this study, the mean age of the patients was 54 ± 12 years (23.7% were male and 11.1% were ever smokers), and the median follow-up period was 50 months (interquartile range, 38–65). The disease subtypes comprised polymyositis/dermatomyositis (29.8%), systemic lupus erythematosus (8.3%), systemic sclerosis (13.3%), ankylosing spondylitis (0.1%), Sjogren syndrome (14.4%), mixed connective tissue disease (5.9%), rheumatoid arthritis (RA) (7.4%), undifferentiated connective tissue disease (15.3%) and overlap syndromes (5.5%). A total of 158 patients died during the follow-up period; the 3- and 5-year mortality rates were 17.1% (95% confidence interval (CI) 14.2–19.8%) and 24.5% (95% CI 20.9–28.0%), respectively (Fig. 2). Compared with surviving patients, deceased patients were significantly more likely to be older, male, ever smokers and farmers, and to have been treated without immunosuppressive drugs (all P < 0.050). Patients with RA had the highest mortality among the CTD subtypes (P < 0.001). Deceased patients were also more likely to have a lower percent predicted value of diffusion capacity of the lung for carbon monoxide (DLco %Predicted) and left ventricular ejection fraction (LVEF), a larger right atrial area (RAA) and a higher pulmonary artery systolic pressure (PASP) (all P < 0.050). In addition, honeycombing, fine reticular opacities, local pleural thickening, pulmonary bullae, hydrothorax and hydropericardium on chest HRCT were each associated with a higher risk of death (all P < 0.050) (Table 1).

Model derivation

A total of 74 prognostic indicators were included in this study. First, we reduced the dimensionality and selected the most informative prognostic indicators using LASSO-penalized Cox regression. Tenfold cross-validation of the LASSO model was performed to select the tuning parameter via the minimum criterion (Fig. 3A). The trajectory of each prognostic indicator’s coefficient across values of the log-transformed lambda is shown in the LASSO coefficient profiles (Fig. 3B).

The optimal lambda value selected by the LASSO algorithm was 0.052 (log(lambda) = −2.950), and 14 variables were selected as potential prognosis-related indicators: age, RA, DLco %Predicted, right ventricular diameter (RVD), RAA, PASP, LVEF, honeycombing, C-reactive protein (CRP), B-type natriuretic peptide (BNP), aspartate transaminase (AST), γ-glutamyltranspeptidase (GGT), albumin (ALB) and immunosuppressive agents. Univariable analysis showed that higher age (hazard ratio (HR) 1.041, 95% CI 1.027–1.055), RVD (HR 1.027, 95% CI 1.017–1.038), RAA (HR 1.122, 95% CI 1.093–1.151), PASP (HR 1.025, 95% CI 1.017–1.034), CRP (HR 1.005, 95% CI 1.002–1.008), BNP (HR 1.000, 95% CI 1.000–1.000), AST (HR 1.003, 95% CI 1.002–1.005) and GGT (HR 1.002, 95% CI 1.001–1.003), as well as lower DLco %Predicted (HR 0.982, 95% CI 0.975–0.990), LVEF (HR 0.949, 95% CI 0.926–0.973) and ALB levels (HR 0.936, 95% CI 0.914–0.959), were associated with increased mortality (all P < 0.001). Patients with RA (HR 2.292, 95% CI 1.539–3.413, P < 0.001) and honeycombing (HR 2.167, 95% CI 1.392–3.373, P = 0.001) also had higher mortality. In addition, mortality was lower in patients receiving immunosuppressive therapy (HR 0.506, 95% CI 0.367–0.697, P < 0.001) (Table 2). Variables that were significant in the univariable analysis (P < 0.050) were entered into a multivariable Cox model, which showed that age, RA, DLco %Predicted, RVD, RAA, honeycombing, immunosuppressive agents, AST and ALB significantly affected overall mortality (all P < 0.050) (Table 2). Based on the multivariable Cox regression analysis, these 9 independent variables were included in the nomogram for prognostic assessment (Fig. 4).
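
Assuming the hazard ratios above are expressed per one-unit increase in each continuous variable (for example, per year of age or per percentage point of DLco %Predicted; the text does not state the units), the log-linear form of the Cox model implies that the hazard ratio for a larger increment is the per-unit value raised to that power. The sketch below works through this arithmetic; the function name and the 10-unit increments are illustrative assumptions.

```python
import math


def hr_for_increment(hr_per_unit, delta):
    """Hazard ratio implied by a `delta`-unit change, given a per-unit HR.

    Assumes the Cox model's log-linear effect: HR(delta) = exp(beta * delta),
    where beta = ln(per-unit HR).
    """
    beta = math.log(hr_per_unit)
    return math.exp(beta * delta)


# Univariable HRs quoted in the text; the per-unit interpretation is an assumption.
age_hr_10 = hr_for_increment(1.041, 10)    # 10 additional years of age
dlco_hr_10 = hr_for_increment(0.982, 10)   # 10 additional points of DLco %Predicted
print(round(age_hr_10, 2), round(dlco_hr_10, 2))
```

Under these assumptions, a per-unit HR that looks close to 1 can still correspond to a sizeable difference in hazard across a clinically realistic range of the variable.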

Table 2 Risk factors for all-cause mortality in CTD-ILD

Model validation

The ILD-GAP model showed increasing mortality in patients with higher scores in univariable Cox regression (HR 1.413, 95% CI 1.285–1.554, P < 0.001; Table 2). However, the ILD-GAP model did not perform well in predicting mortality (Harrell’s C index 0.652), and calibration plots showed that it overestimated 3- and 5-year survival rates (Fig. 5A, B).
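
For readers unfamiliar with Harrell’s C index, the following minimal Python sketch shows how it can be computed for right-censored data: among all pairs in which the patient with the shorter observed time died, it is the proportion in which that patient also had the higher predicted risk. The function name and toy data are hypothetical, tied survival times are simply skipped, and this is a simplified illustration rather than the implementation in the “survival” package.

```python
def harrell_c(time, event, risk):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is usable when subject i has the shorter observed time
    and experienced the event. Pairs with tied times are ignored here.
    """
    concordant = tied_risk = usable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if time[i] < time[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    tied_risk += 1
    return (concordant + 0.5 * tied_risk) / usable


# Hypothetical follow-up times (months), death indicators and risk scores.
time = [5, 10, 15, 20]
event = [1, 0, 1, 0]
risk = [0.9, 0.4, 0.1, 0.6]
c_index = harrell_c(time, event, risk)
print(c_index)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives context for the reported values of 0.652 and 0.767.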

The nomogram showed better prognostic performance (Harrell’s C index 0.767) than the ILD-GAP model. The likelihood ratio test indicated a statistically significant improvement when the nomogram was added to the ILD-GAP model (P < 0.001), but no significant improvement when the ILD-GAP model was added to the nomogram (P = 0.455) (Table 3). Calibration plots of nomogram-predicted 3- and 5-year overall survival showed good agreement with actual observations (Fig. 5C, D). The nomogram also improved discrimination of 3-year (IDI 0.137 and NRI 0.294, both P < 0.001) and 5-year (IDI 0.136 and NRI 0.325, both P < 0.001) mortality compared with the ILD-GAP model (Table 4). To assess the clinical utility of both models, we performed decision curve analysis. For decision thresholds > 0%, the nomogram showed a higher net benefit than the ILD-GAP model for clinical intervention (Fig. 6A, B). In internal validation, the average Harrell’s C index for the prediction models developed in the bootstrap samples was 0.876, and the estimate of optimism was −0.108.
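
The net benefit plotted in a decision curve can be illustrated with a minimal Python sketch for a binary outcome. At a threshold probability pt, net benefit is the true-positive rate per patient minus the false-positive rate per patient weighted by pt / (1 − pt). The function names and toy data are hypothetical, and the sketch ignores censoring, which a survival-specific implementation such as “stdca.R” has to account for.

```python
def net_benefit(pred_risk, event, threshold):
    """Net benefit of intervening on patients with predicted risk >= threshold."""
    n = len(event)
    tp = sum(1 for p, e in zip(pred_risk, event) if p >= threshold and e)
    fp = sum(1 for p, e in zip(pred_risk, event) if p >= threshold and not e)
    weight = threshold / (1 - threshold)  # harm of a false positive relative to a true positive
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(event, threshold):
    """Reference strategy: intervene on every patient regardless of the model."""
    prevalence = sum(1 for e in event if e) / len(event)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted mortality risks and observed deaths for four patients.
pred_risk = [0.8, 0.6, 0.3, 0.1]
event = [1, 0, 1, 0]
nb_model = net_benefit(pred_risk, event, threshold=0.2)
nb_all = net_benefit_treat_all(event, threshold=0.2)
print(nb_model, nb_all)
```

A decision curve repeats this calculation over a range of thresholds; a model is considered more clinically useful where its curve lies above those of the competing model and of the treat-all and treat-none (net benefit 0) strategies.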

Table 3 Comparison of nomogram and the ILD-GAP model

Table 4 Prediction improvement with nomogram compared to ILD-GAP model

Discussion

The ILD-GAP model was derived and validated in a Western cohort but has not been validated in a Chinese population to date, and its ability to accurately define disease stage remains debated [14,15,16]. To address potential racial bias in the ILD-GAP model, we developed a nomogram for predicting 3- and 5-year mortality in Chinese CTD-ILD patients using a machine learning approach and tested whether combining the nomogram with the ILD-GAP model could yield superior prognostic performance.

Multivariable analysis demonstrated that older age, RA, honeycombing, lower DLco %Predicted and ALB, and increased RVD, RAA and AST were associated with higher mortality, whereas immunosuppressive therapy was associated with reduced mortality. These independent risk factors are supported by previous studies and theories. Age has been shown to be an independent predictor of mortality in CTD-ILD, because older patients generally have more comorbidities and worse health status [16]. Among ILD patterns, usual interstitial pneumonia (UIP) on chest HRCT is associated with a poor response to corticosteroids and a worse prognosis than other subtypes [17, 18]. Honeycombing occurs in up to 90% of UIP cases and is the most specific finding of UIP on chest HRCT [19]. Therefore, honeycombing is correlated with the prognosis of CTD-ILD patients to some extent. Gas exchange impairment is a common pathophysiological change in the early stage of ILD and typically presents as a reduction in DLco [20]. Qiang Fu et al. reported that a DLco %Predicted < 45% is a risk factor for CTD-ILD prognosis [21]. Elevated serum AST and abnormal ALB can be caused by impaired heart, liver and kidney function due to CTD-ILD [22]. Long-term monitoring of serum AST and ALB may provide an early warning signal before organ dysfunction occurs. Furthermore, abnormally elevated AST and hypoalbuminemia have been shown to increase mortality in CTD-ILD patients [23,24,25]. Long-term hypoxia caused by gas exchange impairment may lead to an increase in pulmonary artery pressure and right ventricular afterload [26]. Right heart enlargement due to persistently increased afterload, characterized by increases in RVD and RAA, is a common cause of mortality in patients with ILD [27]. In addition, glucocorticoid and immunosuppressive therapy are essential choices for CTD-ILD patients, and mortality can be reduced by the appropriate use of immunosuppressive agents [2, 28,29,30]. ILD can complicate RA and is associated with excess mortality [31]. Research has shown that nearly 10% of deaths in RA patients were attributable to ILD, and RA patients are more likely to die of ILD than other CTD patients [2, 32].

We developed a nomogram from these independent mortality risk factors based on the multivariable analysis. For this nomogram, we assessed the association between predictor variables and time-to-event outcomes using the LASSO-Cox method. LASSO is a machine learning algorithm that uses regularization to improve estimation accuracy: it incorporates an L1 penalty term into the loss function, which shrinks coefficients towards zero. The LASSO-Cox method has recently become popular among researchers because it can minimize overfitting and select predictors for a nomogram [33].
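
The way an L1 penalty shrinks coefficients and sets some exactly to zero can be illustrated with the soft-thresholding operator, which gives the LASSO solution for a single standardized predictor and is the basic update used in coordinate-descent solvers. The Python sketch below is a simplified illustration under that assumption, not the penalized Cox partial-likelihood fit performed by “glmnet”; the unpenalized coefficients are hypothetical, and the penalty of 0.052 is borrowed from the reported optimal lambda purely for illustration.

```python
def soft_threshold(z, lam):
    """Soft-thresholding: shrink z towards zero by lam, and return 0 if |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized coefficients for four standardized predictors.
unpenalized = [0.30, -0.04, 0.08, -0.20]
lam = 0.052  # illustrative penalty strength
penalized = [soft_threshold(b, lam) for b in unpenalized]
selected = [i for i, b in enumerate(penalized) if b != 0.0]
print(penalized, selected)
```

Predictors whose unpenalized effect is smaller in magnitude than the penalty are dropped entirely, which is why LASSO performs variable selection as well as shrinkage.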

In our cohort, the nomogram for Chinese CTD-ILD patients showed better discriminative ability, calibration and clinical net benefit than the ILD-GAP model. Although combining the nomogram with the ILD-GAP model improved prognostic performance compared with the ILD-GAP model alone, it did not improve prognostic performance compared with the nomogram alone. Specifically, Harrell’s C index and the calibration curves of the nomogram showed good concordance between predicted and actual mortality risk. The integrated discrimination improvement and net reclassification improvement confirmed that the nomogram also discriminated mortality better than the ILD-GAP model. For decision thresholds > 0%, the nomogram showed a higher net benefit than the ILD-GAP model for clinical intervention in decision curve analysis. Two reasons might explain why the ILD-GAP model is inferior to the nomogram in predicting the prognosis of Chinese CTD-ILD patients. First, the ILD-GAP model was derived and validated in a Western cohort that included no Chinese population; thus, the risk of bias arising from ethnic differences should be considered. Second, the ILD-GAP model was derived from the GAP risk prediction model, which was specifically developed for prognosis prediction in idiopathic pulmonary fibrosis (IPF) patients. However, the median survival time in IPF is much shorter than in CTD-ILD [5, 34]. The ILD-GAP model can nonetheless provide important value for the treatment of CTD-ILD patients. To achieve better predictive results, a more complex model seems necessary [35,36,37]. The clinical indicators included in this nomogram are routine and easily acquired in most hospitals, which makes the nomogram applicable to daily clinical use. We strongly believe that the nomogram could become a widely used clinical reference after cross-sectional and longitudinal validation and improvement.

Our study had some limitations. First, the nomogram was not subjected to external validation; therefore, caution is advised when applying it in a clinical setting. To the best of our knowledge, this is the first predictive model developed for predicting all-cause mortality in the Chinese population with CTD-ILD, and we believe that an early report is needed to provide a basis for future studies. Second, the disease categories, rather than the serologic autoantibodies, were included as predictors in the nomogram because of the risk of collinearity. Third, the median survival time for CTD-ILD patients has been reported to be around 6.5 years, but the median follow-up period in our cohort was 50 (interquartile range, 38–65) months. However, our study had a larger sample size and a longer follow-up period than most previous studies. Fourth, the nomogram and the ILD-GAP model were established from baseline characteristics, and longitudinal disease activity was not considered. Thus, the omission of risk-associated disease trajectories has likely led both models to underestimate the true relation between CTD-ILD and mortality.

Conclusions

In conclusion, the ILD-GAP model performed poorly in predicting mortality in Chinese patients with CTD-ILD. Using a machine learning approach, we developed a nomogram for predicting 3- and 5-year mortality in Chinese CTD-ILD patients, and it performed well in predicting mortality risk.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

CTD: Connective tissue disease
ILD: Interstitial lung disease
CTD-ILD: Connective tissue disease-associated interstitial lung disease
PFTs: Pulmonary function tests
PM/DM: Polymyositis/Dermatomyositis
SLE: Systemic lupus erythematosus
SSc: Systemic sclerosis
AS: Ankylosing spondylitis
SS: Sjogren syndrome
MCTD: Mixed connective tissue disease
RA: Rheumatoid arthritis
UCTD: Undifferentiated connective tissue disease
OCTD: Overlap syndromes
HRCT: High-resolution computed tomography
FVC %Predicted: Percent predicted values of forced vital capacity
FEV1 %Predicted: Percent predicted values of forced expiratory volume in one second
DLco %Predicted: Percent predicted values of diffusion capacity of lung for carbon monoxide
RVD: Right ventricular diameter
RAA: Right atrial area
PASP: Pulmonary artery systolic pressure
LVEF: Left ventricular ejection fraction
BNP: B-type natriuretic peptide
AST: Aspartate transaminase
GGT: γ-Glutamyltranspeptidase
ALB: Albumin
SD: Standard deviation
IQR: Interquartile range
LASSO: Least absolute shrinkage and selection operator
NRI: Net reclassification improvement
IDI: Integrated discrimination improvement
DCA: Decision curve analysis

References

Spagnolo P, Distler O, Ryerson CJ, Tzouvelekis A, Lee JS, Bonella F, et al. Mechanisms of progressive fibrosis in connective tissue disease (CTD)-associated interstitial lung diseases (ILDs). Ann Rheum Dis. 2021;80(2):143–50.
Article CAS PubMed Google Scholar
Mathai SC, Danoff SK. Management of interstitial lung disease associated with connective tissue disease. BMJ (Clinical research ed). 2016;352:h6819.
PubMed Central Google Scholar
Suzuki A, Kondoh Y, Fischer A. Recent advances in connective tissue disease related interstitial lung disease. Expert Rev Respir Med. 2017;11(7):591–603.
Article CAS PubMed Google Scholar
Demoruelle MK, Mittoo S, Solomon JJ. Connective tissue disease-related interstitial lung disease. Best Pract Res Clin Rheumatol. 2016;30(1):39–52.
Article PubMed Google Scholar
Ryerson CJ, Vittinghoff E, Ley B, Lee JS, Mooney JJ, Jones KD, et al. Predicting survival across chronic interstitial lung disease: the ILD-GAP model. Chest. 2014;145(4):723–8.
Article PubMed Google Scholar
McVeigh CM, Cairns AP. Diagnosis and management of ankylosing spondylitis. BMJ (Clinical research ed). 2006;333(7568):581–5.
Article Google Scholar
Sharp GC, Irvin WS, Tan EM, Gould RG, Holman HR. Mixed connective tissue disease–an apparently distinct rheumatic disease syndrome associated with a specific antibody to an extractable nuclear antigen (ENA). Am J Med. 1972;52(2):148–59.
Article CAS PubMed Google Scholar
Vitali C, Bombardieri S, Moutsopoulos HM, Coll J, Gerli R, Hatron PY, et al. Assessment of the European classification criteria for Sjogren’s syndrome in a series of clinically defined cases: results of a prospective multicentre study. The European Study Group on Diagnostic Criteria for Sjogren’s Syndrome. Ann Rheum Dis. 1996;55(2):116–21.
Article CAS PubMed PubMed Central Google Scholar
Wolf L, Sheahan M, McCormick J, Michel B, Moskowitz RW. Classification criteria for systemic lupus erythematosus. Frequency in normal patients. JAMA. 1976;236(13):1497–9.
Article CAS PubMed Google Scholar
Bohan A, Peter JB. Polymyositis and dermatomyositis (first of two parts). N Engl J Med. 1975;292(7):344–7.
Article CAS PubMed Google Scholar
Preliminary criteria for the classification of systemic sclerosis (scleroderma). Subcommittee for scleroderma criteria of the American Rheumatism Association Diagnostic and Therapeutic Criteria Committee. Arthritis Rheum. 1980;23(5):581–90.
Aletaha D, Neogi T, Silman AJ, Funovits J, Felson DT, Bingham CO 3rd, et al. 2010 Rheumatoid arthritis classification criteria: an American College of Rheumatology/European League Against Rheumatism collaborative initiative. Arthritis Rheum. 2010;62(9):2569–81.
Article PubMed Google Scholar
Hu Y, Wang LS, Wei YR, Du SS, Du YK, He X, et al. Clinical characteristics of connective tissue disease-associated interstitial lung disease in 1,044 Chinese patients. Chest. 2016;149(1):201–8.
Article PubMed Google Scholar
Brusca RM, Pinal-Fernandez I, Psoter K, Paik JJ, Albayda J, Mecoli C, et al. The ILD-GAP risk prediction model performs poorly in myositis-associated interstitial lung disease. Respir Med. 2019;150:63–5.
Article PubMed PubMed Central Google Scholar
Mango RL, Matteson EL, Crowson CS, Ryu JH, Makol A. Assessing mortality models in systemic sclerosis-related interstitial lung disease. Lung. 2018;196(4):409–16.
Article CAS PubMed Google Scholar
Kam MLW, Li HH, Tan YH, Low SY. Validation of the ILD-GAP model and a local nomogram in a singaporean cohort. Respiration. 2019;98(5):383–90.
Article CAS PubMed Google Scholar
Yunt ZX, Chung JH, Hobbs S, Fernandez-Perez ER, Olson AL, Huie TJ, et al. High resolution computed tomography pattern of usual interstitial pneumonia in rheumatoid arthritis-associated interstitial lung disease: relationship to survival. Respir Med. 2017;126:100–4.
Article PubMed Google Scholar
Kim EJ, Elicker BM, Maldonado F, Webb WR, Ryu JH, Van Uden JH, et al. Usual interstitial pneumonia in rheumatoid arthritis-associated interstitial lung disease. Eur Respir J. 2010;35(6):1322–8.
Article CAS PubMed Google Scholar
Chung JH, Chawla A, Peljto AL, Cool CD, Groshong SD, Talbert JL, et al. CT scan findings of probable usual interstitial pneumonitis have a high predictive value for histologic usual interstitial pneumonitis. Chest. 2015;147(2):450–9.
Article PubMed Google Scholar
Kelly CA, Saravanan V, Nisar M, Arthanari S, Woodhead FA, Price-Forbes AN, et al. Rheumatoid arthritis-related interstitial lung disease: associations, prognostic factors and physiological and radiological characteristics–a large multicentre UK study. Rheumatology (Oxford). 2014;53(9):1676–82.
Article CAS Google Scholar
Fu Q, Wang L, Li L, Li Y, Liu R, Zheng Y. Risk factors for progression and prognosis of rheumatoid arthritis-associated interstitial lung disease: single center study with a large sample of Chinese population. Clin Rheumatol. 2019;38(4):1109–16.
Article PubMed Google Scholar
Hull RP, Goldsmith DJ. Nephrotic syndrome in adults. BMJ (Clinical research ed). 2008;336(7654):1185–9.
Article Google Scholar
Lawrence YA, Steiner JM. Laboratory evaluation of the liver. Vet Clin North Am Small Anim Pract. 2017;47(3):539–53.
Article PubMed Google Scholar
Li R, Zhu WJ, Wang F, Tang X, Luo F. AST/ALT ratio as a predictor of mortality and exacerbations of PM/DM-ILD in 1 year-a retrospective cohort study with 522 cases. Arthritis Res Ther. 2020;22(1):202.
Article CAS PubMed PubMed Central Google Scholar
Akirov A, Masri-Iraqi H, Atamna A, Shimon I. Low albumin levels are associated with mortality risk in hospitalized patients. Am J Med. 2017;130(12):1465.
Article CAS PubMed Google Scholar
Grimminger J, Ghofrani HA, Weissmann N, Klose H, Grimminger F. COPD-associated pulmonary hypertension: clinical implications and current methods for treatment. Expert Rev Respir Med. 2016;10(7):755–66.
Article CAS PubMed Google Scholar
Wang Z, Chesler NC. Pulmonary vascular mechanics: important contributors to the increased right ventricular afterload of pulmonary hypertension. Exp Physiol. 2013;98(8):1267–73.
Article PubMed PubMed Central Google Scholar
Castelino FV, Varga J. Interstitial lung disease in connective tissue diseases: evolving concepts of pathogenesis and management. Arthritis Res Ther. 2010;12(4):213.
Article PubMed PubMed Central Google Scholar
Vij R, Strek ME. Diagnosis and treatment of connective tissue disease-associated interstitial lung disease. Chest. 2013;143(3):814–24.
Article CAS PubMed PubMed Central Google Scholar
Witt LJ, Demchuk C, Curran JJ, Strek ME. Benefit of adjunctive tacrolimus in connective tissue disease-interstitial lung disease. Pulm Pharmacol Ther. 2016;36:46–52.
Article CAS PubMed PubMed Central Google Scholar
Young A, Koduri G, Batley M, Kulinskaya E, Gough A, Norton S, et al. Mortality in rheumatoid arthritis. Increased in the early course of disease, in ischaemic heart disease and in pulmonary fibrosis. Rheumatology. 2007;46(2):350–7.
Olson AL, Swigris JJ, Sprunger DB, Fischer A, Fernandez-Perez ER, Solomon J, et al. Rheumatoid arthritis-interstitial lung disease-associated mortality. Am J Respir Crit Care Med. 2011;183(3):372–8.
Tibshirani R. The lasso method for variable selection in the Cox model. Stat Med. 1997;16(4):385–95.
Ley B, Ryerson CJ, Vittinghoff E, Ryu JH, Tomassetti S, Lee JS, et al. A multidimensional index and staging system for idiopathic pulmonary fibrosis. Ann Intern Med. 2012;156(10):684–91.
Jones PW, Quirk FH, Baveystock CM. The St George’s respiratory questionnaire. Respir Med. 1991;85(Suppl B):25–31.
Schurink CAM, Nieuwenhoven CAV, Jacobs JA, Rozenberg-Arska M, Joore HCA, Buskens E, et al. Clinical pulmonary infection score for ventilator-associated pneumonia: accuracy and inter-observer variability. Intensive Care Med. 2004;30(2):217–24.
Valencia M, Badia JR, Cavalcanti M, Ferrer M, Agusti C, Angrill J, et al. Pneumonia severity index class v patients with community-acquired pneumonia: characteristics, outcomes, and value of severity scores. Chest. 2007;132(2):515–22.

Acknowledgements

Not applicable.

Funding

This study was supported by the National Natural Science Foundation of China (U1904142, 82000015), the Scientific and Technological Projects of the Science and Technology Department of Henan Province (182102410010), and the Key Scientific Research Project of Colleges and Universities in Henan Province (18A320056).

Author information

Authors and Affiliations

Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, The People’s Republic of China
Di Sun, Yu Wang, Qing Liu, Tingting Wang, Pengfei Li, Tianci Jiang, Lingling Dai, Liuqun Jia, Wenjing Zhao & Zhe Cheng

Contributions

DS, YW, QL, TT-W, PF-L, TC-J, LL-D, LQ-J and WJ-Z selected the patients and acquired the data. DS analyzed and interpreted the data and wrote the manuscript. YW was substantially involved in revising the article. ZC had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Zhe Cheng.

Ethics declarations

Ethics approval and consent to participate

Ethical approval for this study was obtained from the Institutional Review Board of the First Affiliated Hospital of Zhengzhou University (2019-KY-116).

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Sun, D., Wang, Y., Liu, Q. et al. Prediction of long-term mortality by using machine learning models in Chinese patients with connective tissue disease-associated interstitial lung disease. Respir Res 23, 4 (2022). https://doi.org/10.1186/s12931-022-01925-x

Received: 28 August 2021
Accepted: 03 January 2022
Published: 07 January 2022
DOI: https://doi.org/10.1186/s12931-022-01925-x
