Five novel clinical phenotypes for critically ill patients with mechanical ventilation in intensive care units: a retrospective and multi database study

Background Although protective mechanical ventilation (MV) has been used in a variety of applications, lung injury may occur in both patients with and without acute respiratory distress syndrome (ARDS). The purpose of this study is to use machine learning to identify clinical phenotypes for critically ill patients with MV in intensive care units (ICUs). Methods A retrospective cohort study was conducted with 5013 patients who had undergone MV and treatment in the Department of Critical Care Medicine, Peking Union Medical College Hospital. Statistical and machine learning methods were used. All the data used in this study, including demographics, vital signs, circulation parameters and mechanical ventilator parameters, etc., were automatically extracted from the electronic health record (EHR) system. An external database, Medical Information Mart for Intensive Care III (MIMIC III), was used for validation. Results Phenotypes were derived from a total of 4009 patients who underwent MV using a latent profile analysis of 22 variables. The associations between the phenotypes and disease severity and clinical outcomes were assessed. Another 1004 patients in the database were enrolled for validation. Of the five derived phenotypes, phenotype I was the most common subgroup (n = 2174; 54.2%) and was mostly composed of the postoperative population. Phenotype II (n = 480; 12.0%) led to the most severe conditions. Phenotype III (n = 241; 6.01%) was associated with high positive end-expiratory pressure (PEEP) and low mean airway pressure. Phenotype IV (n = 368; 9.18%) was associated with high driving pressure, and younger patients comprised a large proportion of the phenotype V group (n = 746; 18.6%). In addition, we found that the mortality rate of Phenotype IV was significantly higher than that of the other phenotypes. In this subgroup, the number of patients in the sequential organ failure assessment (SOFA) score segment (9,22] was 198, the number of deaths was 88, and the mortality rate was higher than 44%. However, the cumulative 28-day mortality of Phenotypes IV and II, which were 101 of 368 (27.4%) and 87 of 480 (18.1%) unique patients, respectively, was significantly higher than those of the other phenotypes. There were consistent phenotype distributions and differences in biomarker patterns by phenotype in the validation cohort, and external verification with MIMIC III further generated supportive results. Conclusions Five clinical phenotypes were correlated with different disease severities and clinical outcomes, which suggested that these phenotypes may help in understanding heterogeneity in MV treatment effects.


Background
Ventilator-associated lung injury (VALI) commonly occurs in patients hospitalized in the intensive care units (ICUs). A growing body of studies have shown that VALI is not limited to patients with acute respiratory distress syndrome (ARDS) but might occur in patients undergoing mechanical ventilation (MV) without lung injury; improper placement of the ventilator can also cause VALI [1,2]. The population undergoing surgeries under anesthesia requiring short-term ventilation is a welldocumented group. Each year, approximately 230 million patients worldwide are administered MV for surgical anesthesia. Most of these patients may have no basic lung disease before surgery, but after a short period of MV, the incidence of postoperative pulmonary complications has been shown to be 5% to 40%, and it has been shown to occur in patients undergoing chest or abdominal surgery in particular; these patients are more likely to develop postoperative edema, consolidation, and even ARDS [3]. As it is the most serious postoperative pulmonary complication, ARDS is the leading cause of postoperative respiratory failure [2,3]. It is worth noting that in patients without lung injury who receive MV for more than 48 h, the incidence of ARDS within 5 days is nearly 20% [4]. Therefore, identifying distinct clinical phenotypes and adopting different ventilation strategies for different patients accordingly may potentially benefit both critical care clinical research and practices.
Previous clinical studies of MV in critically ill patients have analyzed MV as a homogeneous disease or only discussed how to perform pulmonary protective ventilation in patients with ARDS [5][6][7]. In addition, the current methods of selection, assessment, and intervention for MV in the ICU rarely consider distinct clinical phenotypes in patients receiving MV. However, distinct clinical phenotypes of MV are actually associated with different disease severities and prognoses and may be caused by a variety of underlying mechanisms. It is unclear whether various phenotypes of MV should be distinguished when clinicians administer MV treatments. There is no publicly available evidence showing whether clinicians should only focus on small tidal volume lung protection or a certain target, such as the plateau pressure, tidal volume (VT), or positive end-expiratory pressure (PEEP), when they treat critically ill patients. To resolve the practical problems encountered in the clinic and overcome the bottleneck of traditional studies, we used a database and machine learning based method to explore the significance of MV phenotypes. Perhaps there are some potential phenotypes relevant to disease severity and patient outcomes that may contribute to MV therapy. The model can predict which phenotype the patient is and guide the next step of treatment.

Patient data collection
Using the administrative database of Peking Union Medical College Hospital and the data from our previously published study in Critical Care Medicine, we conducted a retrospective study of patients undergoing MV [8]. From May 2013 to December 2016, we identified MV patients admitted to the ICU of Peking Union Medical College Hospital. Patients younger than 18 years of age or admitted to the ICU were excluded from the 24-h period. The Institute of Institutional Research and Ethics of Peking Union Medical College Hospital approved this study involving human subjects. We retrieved data on a total of 5013 unique adult individuals from Peking Union Medical College Hospital who received MV during their ICU stay. We randomly divided this batch of data into two groups according to a ratio of 4:1. A total of 4009 patients were enrolled as training cohort, and 1004 patients were collected as validation cohort.

Ventilator mode selection
Lung-protective strategies for MV were performed on all of the patients who were admitted to the ICU. When the patients were under adequate sedation and analgesia but without spontaneous breathing, volume-controlled or pressure-controlled ventilation was used. Once the patient had spontaneous breathing, controlled ventilation was immediately converted to pressure support ventilation.

Statistical methods
We selected 22 candidate variables based on their relevance to MV. These mainly included demographic characteristics (age and temperature (T)), scores (acute physiology and chronic health evaluation (APACHE) II scores and sequential organ failure assessment (SOFA) scores), respiratory variables (respiratory rate (RR), oxygen concentration in the inhalation gas (FiO 2 ), pulse oxygen saturation (SpO 2 %), mean airway pressure (Pmean), peak airway pressure (Ppeak), positive end-expiratory pressure (PEEP), tidal volume (VT), partial pressure of oxygen (pO 2 ) and partial pressure of carbon dioxide (pCO 2 )), circulatory and perfusion variables (heart rate (HR), mean arterial pressure (MAP), central venous pressure (CVP), static arterial carbon dioxide partial pressure (Pv-aCO 2 ), lactate, P(v-a)CO 2 /C(a-v)O 2 ratio, perfusion index (PI) and hemoglobin (Hb)) and a persistence variable (fluid balance). For each variable, we computed the mean value during the first 24 h of hospital presentation. To derive the phenotypes of MV, we first evaluate the distribution, missingness, and correlation of the above candidate variables. Due to the scarcity of the data samples, the missing variables were not simply eliminated. Instead, multiple imputation with chained equations was used to account for missing data. We also excluded highly correlated variables using rank-order statistics in the sensitivity analysis with threshold value of 0.75. Because the data type of the candidate variables was numerical, we used latent profile analysis (Gaussian mixture model) to recover hidden groups based on the means of the continuous variables observed. It always involved multiple constraint conditions: "model_1" involved equal variance and zero covariance, and "model_2" involved varying variance and zero covariance. An algorithm, the "expectation maximization" (EM) algorithm, was used to find the means and standard deviations of these distributions through two steps. First, we started with some initial starting guesses of the means and standard deviations. Based on these guesses, we assigned a posterior probability of being in a certain group to each patient. Second, these posterior probabilities were then used to update our guesses of the within-class parameters, which, in turn, were used to update the posterior probability, and so on until nothing seemed to change much anymore. Then, "model_1" was adopted in this manuscript. In the latent class analysis, the optimal phenotype number was confirmed using a combination of Bayesian information criteria, adequate size, high median probabilities of group membership with each phenotype, maximum entropy, and clinical features of hidden groups.
To visualize the optimal phenotype results and patterns in clinical variables, the data were analyzed with three types of plots: (1) ranked plots of the variables by the mean and standardized difference among the phenotypes, (2) t-distributed stochastic neighbor embedding (t-SNE) plots (which show multidimensional data in three dimensions), and (3) alluvial plots (which show the proportional distribution of phenotype members across specific variables).
Since the SOFA score is used to track a person's disease severity during their stay in the ICU and determine the extent of his or her organ function or rate of failure, we also performed an analysis to determine the relationship between the new phenotypic classification and the SOFA score, which included determining (1) whether the traditional SOFA score reflects the severity of the disease; (2) whether the phenotypes overlap with the SOFA score area on the chord graph; and (3) the deaths associated with each phenotype and the corresponding quartile of the SOFA score (especially in the area with overlapping phenotypes). The mortality rate of different phenotypes was significantly different across regions, which was evaluated a posteriori by a clustering phenotypic method.

External database verification
To further verify the phenotyping results yielded from Peking Union Medical College Hospital cohort, we then used another open database for external verification. The selected database is Medical Information Mart for Intensive Care III (MIMIC-III) database, which is a large, freely available database comprising deidentified health-related data associated with over 40,000 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. According to the same inclusion criteria, data from 9457 patients were enrolled in our external verification study.

Five clinical MV phenotypes
Summary information on our patient population is shown in Table 1. Multiple imputation with chained equations was employed to fill null values for each variable. After data preprocessing, we used latent profile analysis to derive groups. It was found that the five-class model was the optimal fit, forming the probability density distributions (1), where Normal(µ l , σ ) is a Gaussian distribution with a mean value µ l and variance σ . The resulting probability density weight, mean value and variance are shown in Eq. 2.
Consequently, the probability of an observation x i categorized as the nth phenotype was (2) (3) p z = n|x i = p(z = n)Normal(µ n , σ )   (2) and (3). Taking the phenotype of the maximum probability implied that observation x i was categorized as phenotype V.
As shown in Fig. 1a and Table 2, the variables were scaled for each phenotype. Broad differences were observed in the distributions of the scaled variables across phenotypes. Of the 22 variables measured, 18 were significantly different across phenotypes with P < 0.05. Several analyses were conducted to ensure that   Table 2. To visualize the differences among all the phenotypes, t-SNE plots were used to reduce the dimensionality. Namely, nonlinear dimensionality reduction was performed in a 22-dimensional variable space to form 2-dimensional features ranging from [− 80, 70] to [− 80, 80]. Moreover, combined with the maximum probability from Eq. 3, a 3-dimensional space {x 1 , x 2 , probability} was thus formed. Therefore, as shown in Fig. 1b, we observed five distinct clusters that were formed in this 3-dimensional space, which are denoted in blue, pink, cyan, yellow, and gray and characterize the five respiratory phenotypes. This result indicated that there was strong separation in the likelihood of membership for patients assigned to a given phenotype compared with those assigned to other phenotypes. The number of patients assigned to phenotypes I, II, III, IV and V was 2174, 480, 241, 368, and 746, respectively, and the corresponding proportion of the total number of patients was 54.2%, 12.0%, 6.01%, 9.18% and 18.6%, respectively.

Severity assessments and prognosis for the 28-day survival curve
Since the APACHE II and SOFA values of phenotypes II and IV were significantly higher than those of the other phenotypes and the respiratory characteristics (RR, FiO 2 , Pmean, and Ppeak) and circulatory characteristics (HR) of phenotype IV were significantly different from those of the other four phenotypes, we used the SOFA score as an example by dividing it into four score intervals: [0, 4], (4,6], (6,9] and (9,22]. We calculated the total number of patients and the mortality of each profile when the mortality rate of the four score intervals was more than 2%. As shown in Fig. 2, considering phenotype IV, which has the highest mortality rate, the number of patients in the SOFA score segment (9,22] was 198, the number of deaths was 88, and the mortality rate was higher than 44%. The mortality of phenotype V in each score segment was less than 2%, so it was not included in Fig. 2.
Furthermore, the chord graph shown in Fig. 2 can effectively reflect the interactions among the SOFA score segment, section and the number of deaths. The upper and lower arcs of the graph correspond to the number of deaths in the section and SOFA score segment and interact with the two in the sphere with different colored surfaces. The number of deaths for phenotypes I, II, III, IV and IV was15, 4, 107, and 120, respectively, and the mortality rate was 6.1%, 1.6%, 43.4% and 48.8%, respectively. Cross-sections I, III, II and IV correspond to the upper halves of the arc denoted by the pink curved surface in a counterclockwise manner, as shown in Fig. 3. The proportions of individuals with phenotypes I, III and V among all cases were 54.2%, 6% and 18.6%, respectively, and the mortality rates for these three phenotypes were 1.23%, 1.40% and 1.59%, respectively, which were all smaller than 2.00%. The proportions of patients with phenotypes II and IV accounted for 12.0% and 9.2% of all patients, respectively, with mortality rates of 27.4% and 18.1%, respectively.

Model stability verification
The validation cohort included 1004 patients, and the summary statistics are depicted in Table 3. Predictions of the phenotype distributions were generated for the validation cohort with the above model. This result was similar to that in the above training cohort. The number of patients assigned to phenotypes I, II, III, IV and V were 542, 124, 52, 81, and 205, respectively, and the corresponding proportions of the total number of patients were 54.0%, 12.4%, 5.18%, 8.07% and 20.4%, respectively (Table 4). Compared with the profile in Fig. 1a, that in Fig. 4 shows consistent differences in the biomarker patterns. To check the phenotype population stability, a metric (population stability index, PSI) was used to identify differences shifts in the population with the following formula: Additionally, we used the PSI to measure 28-day mortality by phenotype in two datasets. The number of deaths at 28 days were 2 (1.11%), 23 (20.2%), 6 (3.85%), 25 (28.4%) and 3 (1.46%). Then, we calculated the PSI for the phenotype populations and mortality across the two datasets. The final values were 0.0045 and 0.007, which indicated an insignificant difference in both the population size and mortality rate between the validation cohort and study cohort.

External verification results
We used MIMIC III cohort for further verification. To minimize the impact of database differences, we finally chose 15 variables and divided them into four groups: The clustering results may not be completely consistent with the data of Peking Union Medical College Hospital; however, the same five profiles were generated as for the Peking Union Medical College Hospital cohort, as displayed in Fig. 5. The SpO 2 of Profile 5 was significantly lower than those of the other profiles, the PEEP values of Profiles 3 and 4 were higher than those of Profiles 1, 2 and 5, and the Pmean of Profile 4 was obviously different from those of the other profiles. While given the differences between databases in ethnic composition, recording methods and data types, the verification results demonstrated that our phenotyping method has generalizability and the model is stable to divide patients undergoing MV into five different subtypes.

Discussion
In this retrospective ICU MV patient study, we used machine learning methods to obtain five ventilatorrelated phenotypes on the day of ICU admission based on the patient's general data, ventilator parameters and partially monitored circulating perfusion indicators, as well as the in-and-out balance (residual characteristics) within 24 h of the day. For these five types of respiratory phenotypes, we used a chord diagram to suggest that the interaction among the SOFA score, profile, and death toll  showed that Profiles IV and II had the highest mortality. Survival curve analysis revealed that the 28-day survival rate of Profile IV and Profile II was significantly lower than the those of the other three profiles. We constructed five ventilator-related phenotypes with mathematical expressions using a potential profile approach and used validation data sets to predict and validate the existence of the five phenotypes. Additionally, the experiments on  Inappropriate ventilation strategies during MV result in lung injury, and the mechanisms include high airway pressures or tidal volumes resulting in lung pressure/ volume injury. Excessively low expiratory lung volumes or atelectasis lead to repeated opening and collapsing of the terminal lung unit. In addition, in patients undergoing MV, even without anatomical changes in the lung tissue, the effects of various forces can induce the release of pro-inflammatory cytokines and the recruitment of white blood cells, and the initiation of local inflammatory processes is called biological injury [9]. The first three injuries are considered to be mechanical injuries caused by mechanical factors, while the remaining injuries are caused by mechanical damage caused by secondary damage involving inflammatory cells and inflammatory mediators. Therefore, information on how to adjust and control MV is clinically important. Previous studies have shown that low tidal volumes, high PEEPs and proper control of the plateau pressure are important for resolving the current MV problem [10,11]. Even simply limiting the VT and plateau pressure is not completely safe. On the other hand, contradictory situations in which various protective strategies are contradictory may also occur in clinical practice. For example, increasing the PEEP will cause a corresponding increase in the plateau pressure when volume control mode, and often, patients with more severe lesions and poorer lung compliance will reach the an acceptable maximum limit of the plateau pressure (often known as 40 cmH 2 O) at a PEEP lower than the required level. Therefore, we urgently need to use new methods to examine traditional clinical problems.
Through analysis, we found that MV patients could be divided into five subtypes. For example, in MV patients of Peking Union Medical College Hospital, Profile IV showed a high mean airway pressure, a high peak airway pressure, a fast respiratory rate and a slow heart rate. Profile III showed a high pure PEEP. Based on the survival analysis, we can clearly see that the mortality rate was the highest for Profile IV compared with the other profiles. Namely, the prognosis of patients who show improved oxygenation by increasing the airway pressure is worse than that of patients who show improved oxygenation by increasing the PEEP without increasing the airway pressure. The occurrence of VILI depends on the interaction of the ventilator and the lungs, i.e., the pressure, volume, flow rate and frequency of the ventilator applied to the lungs and the reactivity of the lung parenchyma. The pressure generated in the lung tissue when MV is used to expand the lung is called stress, which is reflected as transpulmonary pressure. The driving pressure may better reflect the actual ventilation state in terms of the MV phenotype. Namely, with the same tidal volume, the larger the amount of functional residual gas there is, the smaller the strain generated, the smaller the probability of the occurrence of VILI, and the better the prognosis of the patient. In 2015, Amato et al. proposed that for patients with ARDS undergoing MV, the VT, PEEP, and driving pressure (DP) are independent factors [12]. Among these factors, the DP level had the strongest correlation with the survival rate of ARDS patients. The study also suggested that the PEEP and small VTs only play roles in lung protection when the DP is reduced. Neto et al. [13] found that DP is the only factor strongly correlated with postoperative pulmonary complications by performing a high-quality meta-analysis of MV risk [14]. In another study of ARDS, it was found that a DP higher than 14 cmH 2 O significantly affects the prognosis of patients [13].Our study using big data also demonstrates the role of DP in determining the phenotype of patients undergoing MV. In other words, it is only used in patients with good recruitibility to increase the PEEP, which is beneficial, and in patients with poor recruitibility, increasing the PEEP causes excessive expansion, which is harmful [11,15].
Phenotype V represents a younger but higher-scoring patient, and this phenotype may become an independent phenotype, probably due to lung compliance. The parameters of MV did not appear to have an effect on these aspects of the patients, and the prognosis of the patients were good. Phenotype II is different from phenotype III, as the patients classified as phenotype II have higher scores, but they are not considerably different in terms of age. These patients had poor organ function which may not be only reflected from the lung-related variables, so these patients may form an independent group as phenotype II. The last group is phenotype I, which represents the majority of clinical patients and the most homogeneous group. We further analyzed the departments in which the patients were admitted and found that phenotypes I, III, and V were from surgery, and phenotypes II and IV were from internal medicine.
This study also has some limitations. First, although it was a big data study, there was a training set combined with a verification set to validate five phenotypes. We used the data from the MIMIC III cohort to demonstrate our findings. However, this model still needs to be clinically verified by a prospective study. Second, the study included general information, ventilator parameters and circulation and perfusion indicators, as well as the balance of input and output (residual characteristics) within 24 h of the day, which take into account some of the indicators commonly considered in clinical use of MV. It is not known whether it is necessary to include all the indicators that can be obtained and recorded in the clinic or to combine some indicators. Third, many of the parameters included in this study are related to clinical decision making. For example, PEEP, VT and other parameters are more decisions of the clinician than the patient's own respiratory mechanics. Therefore, phenotypic analysis in different institutions may conclude different results. The current conclusion of this study requires further multicenter validation if possible.

Conclusions
In this retrospective analysis, respiratory and circulatory data were obtained from patients undergoing MV from two databases. Five ventilator-related phenotypes were obtained, and from these five phenotypes, Profiles II and IV were both related to mortality, and Profile IV was more significant in higher mortality caused by mechanical ventilation parameters. These five new identified phenotypes of critically ill patients undergoing MV may be helpful in the future identification and interpretation of clinically high-risk patients.