
Phenotypic subtypes of fibrotic hypersensitivity pneumonitis identified by machine learning consensus clustering analysis



Patients with fibrotic hypersensitivity pneumonitis (f-HP) have varied clinical and radiologic presentations whose associated phenotypic outcomes have not been previously described. We conducted a study to evaluate mortality and lung transplant (LT) outcomes among clinical clusters of f-HP as characterized by an unsupervised machine learning approach.


Consensus cluster analysis was performed on a retrospective cohort of f-HP patients diagnosed according to the recent international guideline. Demographics, antigen exposure, radiologic, histopathologic, and pulmonary function findings, along with comorbidities, were included in the cluster analysis. Cox proportional-hazards regression was used to assess mortality or LT risk as a combined outcome for each cluster.


Three distinct clusters were identified among 336 f-HP patients. Cluster 1 (n = 158, 47%) was characterized by mild restriction on pulmonary function testing (PFT). Cluster 2 (n = 46, 14%) was characterized by younger age, lower BMI, and a higher proportion of identifiable causative antigens with baseline obstructive physiology. Cluster 3 (n = 132, 39%) was characterized by moderate to severe restriction. When compared to cluster 1, mortality or LT risk was lower in cluster 2 (hazard ratio (HR) of 0.42; 95% CI, 0.21–0.82; P = 0.01) and higher in cluster 3 (HR of 1.76; 95% CI, 1.24–2.48; P = 0.001).


Three distinct phenotypes of f-HP with unique mortality or transplant outcomes were found using unsupervised cluster analysis, highlighting improved mortality in fibrotic patients with obstructive physiology and identifiable antigens.


Hypersensitivity pneumonitis (HP) is an immune-mediated interstitial lung disease characterized by injury from inhaled organic or inorganic antigens [1, 2]. The 2020 ATS/JRS/ALAT clinical practice guideline categorizes HP into fibrotic and non-fibrotic subtypes based on radiologic or histopathologic findings [1]. Patients with fibrotic hypersensitivity pneumonitis (f-HP) have worse survival compared to those with non-fibrotic disease, with an all-cause mortality rate of 67.5 per 1000 person-years [3]. Identification and avoidance of causative antigens has recently been described as associated with better survival in those with fibrotic disease [4]. Exposure type (e.g., avian vs. mold vs. bacterial) may also be associated with differential outcomes [4]. Specific radiologic findings among patients with lung fibrosis may be correlated with lower forced vital capacity (FVC) or lung function [5]. Although multiple studies have reported the association of specific clinical domains with survival in f-HP, concomitant domains or phenotype analyses have not been previously described.

Machine learning and artificial intelligence have advanced the diagnostic and prognostic association of clinical parameters in medicine. Prior cohort studies have found that specific variables are associated with outcome, though they have not incorporated them into phenotypic subgroups. An additional benefit of phenotyping may be tailoring treatments according to subgroup characteristics, particularly in the context of heterogeneously presenting diseases like HP. Recent studies have shown that clustering methodology may differentiate unique phenotypes with distinct clinical courses or outcomes [6,7,8]. We conducted a study using unsupervised machine learning to identify clinical phenotypes in f-HP and assess their comparative mortality and transplant risk.


Subject selection

This study is a single-center retrospective cohort conducted at Mayo Clinic Rochester. Suspected f-HP patients diagnosed between January 2005 and December 2020 were identified using a computer-assisted search. Each medical record was reviewed by study investigators to verify exposure history, serum specific IgG testing, radiologic findings, bronchoalveolar lavage analysis, and histopathology if obtained. Patients were identified as having identifiable causative antigens if there was documentation of suspected environmental exposure regardless of serum specific IgG testing. Final diagnosis of f-HP was based on the 2020 ATS/JRS/ALAT clinical practice guideline [1] highlighting specific levels of diagnostic confidence. Diagnoses were categorized as definite (level of confidence ≥ 90%), high (80–89%), moderate (70–79%), or low confidence (51–69%). Patients with diagnostic confidence < 50% or missing baseline pulmonary function testing (PFT) were excluded. Our study was approved by the Mayo Clinic Institutional Review Board (approval No. 20–000211).
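The confidence bands above can be written as a small lookup. This is only an illustrative sketch: the function name is hypothetical, and because the text does not say how a confidence of exactly 50% was handled, the code treats it as excluded by assumption.

```python
def confidence_category(confidence_pct):
    """Map a diagnostic confidence (percent) to the category named in the text.

    Assumption: a value of exactly 50% is not covered by the stated bands
    (low is 51-69%, exclusion is < 50%); it is treated as excluded here.
    """
    if confidence_pct >= 90:
        return "definite"
    if confidence_pct >= 80:
        return "high"
    if confidence_pct >= 70:
        return "moderate"
    if confidence_pct > 50:
        return "low"
    return "excluded"


# Hypothetical confidence values, one per band
examples = {pct: confidence_category(pct) for pct in (95, 85, 72, 60, 40)}
print(examples)
```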

Data collection

In addition to diagnostic variables, age, sex, smoking status, body mass index (BMI), presenting PFTs as percent predicted findings for total lung capacity (TLC%), forced vital capacity (FVC%), forced expiratory volume in the first second (FEV1%), FEV1/FVC ratio, diffusion capacity for carbon monoxide (DLCO%), and selected comorbidities (see Table 1) were collated. Missing non-PFT data were imputed by the Random Forest method [9]. Radiologic findings included presence of mosaic attenuation, honeycombing, and those with probable or consistent usual interstitial pneumonia (UIP) high resolution computed tomography (HRCT) patterns. Dates of death, LT, or last follow-up were used to assess long-term outcomes.
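The combined outcome (death or LT, whichever occurs first, otherwise censoring at last follow-up) can be derived from the recorded dates roughly as follows. This is a minimal sketch: the function name, the example dates, and the choice of a baseline date as time origin are assumptions, since the text does not state the time origin used.

```python
from datetime import date


def transplant_free_survival(baseline, death=None, transplant=None, last_follow_up=None):
    """Return (days, event) for the composite outcome.

    event = 1 if death or lung transplant occurred (the earlier date is used),
    event = 0 if the patient is censored at last follow-up.
    """
    outcome_dates = [d for d in (death, transplant) if d is not None]
    if outcome_dates:
        end, event = min(outcome_dates), 1
    else:
        end, event = last_follow_up, 0
    return (end - baseline).days, event


# Hypothetical patients: one transplanted before death, one censored
transplanted = transplant_free_survival(
    date(2010, 1, 1), death=date(2015, 1, 1), transplant=date(2013, 1, 1)
)
censored = transplant_free_survival(date(2010, 1, 1), last_follow_up=date(2012, 1, 1))
print(transplanted, censored)
```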

Table 1 Baseline characteristics of fibrotic hypersensitivity pneumonitis patients as classified by cluster

Clustering analysis

We used an unsupervised machine learning consensus clustering approach to identify clinical subtypes of patients with f-HP [10]. A pre-specified subsampling parameter of 80% with 100 iterations was used. The number of potential clusters (k) was set to a range of two to ten to avoid excessive cluster numbers and clinically irrelevant groupings. The optimal number of clusters was determined by a consensus matrix (CM) heat map, cumulative distribution function (CDF), cluster-consensus plots of the within-cluster consensus scores, and proportion of ambiguously clustered (PAC) pairs. The within-cluster consensus score, an average consensus value for all pairs of individuals in the same cluster, ranged between 0 and 1 [11]. A value closer to 1 indicated better cluster stability. PAC, ranging between 0 and 1, was defined as the proportion of all sample pairs with consensus values falling within the predetermined boundaries [12]. A value closer to zero indicated better cluster stability [12]. Additional details of consensus clustering algorithms are described in the Supplementary file.
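The quantities described above (consensus matrix, within-cluster consensus, PAC) can be sketched in a few lines. This is not the ConsensusClusterPlus implementation used in the study: the base clusterer (a tiny k-means with farthest-first initialisation), the PAC boundaries of 0.1 and 0.9, the toy two-dimensional data, and all function names are assumptions made for illustration. Only the 80% subsampling and 100 iterations come from the text.

```python
import random


def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, rng, n_iter=20):
    """Tiny k-means with farthest-first initialisation; returns one label per point."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # next center: the point farthest from all centers chosen so far
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous center if a cluster empties
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def consensus_matrix(points, k, n_reps=100, frac=0.8, seed=0):
    """Fraction of subsamples in which each pair, when drawn together, co-clustered."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_reps):
        idx = rng.sample(range(n), max(k, round(frac * n)))
        labels = kmeans([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]


def pac(cm, lower=0.1, upper=0.9):
    """Proportion of pairs whose consensus lies strictly between the boundaries."""
    n = len(cm)
    pairs = [cm[i][j] for i in range(n) for j in range(i + 1, n)]
    return sum(lower < v < upper for v in pairs) / len(pairs)


def cluster_consensus(cm, labels):
    """Mean consensus over all pairs assigned to the same final cluster."""
    scores = {}
    for c in set(labels):
        m = [i for i, lab in enumerate(labels) if lab == c]
        vals = [cm[i][j] for a, i in enumerate(m) for j in m[a + 1:]]
        scores[c] = sum(vals) / len(vals) if vals else 1.0
    return scores


# Toy data (hypothetical): three well-separated groups of ten points each
data_rng = random.Random(1)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + data_rng.uniform(-0.5, 0.5), cy + data_rng.uniform(-0.5, 0.5))
          for cx, cy in blob_centers for _ in range(10)]

cm3 = consensus_matrix(points, k=3)
pac3 = pac(cm3)
scores3 = cluster_consensus(cm3, kmeans(points, 3, random.Random(0)))
for k in (2, 3, 4):
    print(k, pac(consensus_matrix(points, k)))
```

On data this cleanly separated, the three-cluster solution is perfectly stable, so every pairwise consensus is 0 or 1. Real clinical data would give intermediate values, which is why PAC and the within-cluster consensus score are compared across candidate values of k.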

Statistical analysis

After cluster identification, we compared baseline characteristics between each cluster using analysis of variance (ANOVA) and Chi-square for continuous and categorical variables, respectively. The standardized mean differences of clinical characteristics between each cluster and the whole cohort were used to determine specific clinical characteristics for each cluster. Variables with an absolute standardized mean difference of > 0.3 were considered key characteristics of the cluster.
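A standardized mean difference (SMD) between a cluster and the whole cohort can be computed as sketched below for continuous variables. The text does not give the exact formula, so the denominator here (the square root of the average of the two sample variances) is an assumption, as are the function names and the toy values. Only the 0.3 threshold is taken from the text.

```python
from math import sqrt
from statistics import mean, variance


def smd(cluster_values, cohort_values):
    """Standardized mean difference of a cluster versus the whole cohort."""
    pooled_sd = sqrt((variance(cluster_values) + variance(cohort_values)) / 2)
    return (mean(cluster_values) - mean(cohort_values)) / pooled_sd


def key_characteristics(cluster, cohort, threshold=0.3):
    """Return variables whose absolute SMD exceeds the threshold."""
    result = {}
    for name, cohort_values in cohort.items():
        value = smd(cluster[name], cohort_values)
        if abs(value) > threshold:
            result[name] = value
    return result


# Hypothetical values for illustration only
cohort = {"age": [50, 60, 70, 80], "bmi": [25, 27, 29, 31]}
cluster = {"age": [50, 60], "bmi": [27, 29]}
keys = key_characteristics(cluster, cohort)
print(keys)
```

In this toy example the cluster is younger than the cohort (SMD of about -0.96) while BMI is identical on average, so only age is flagged.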

Association of each cluster with transplant-free survival was evaluated using Cox proportional hazard regression analysis reported as a hazard ratio (HR) with 95% confidence interval (CI). Survival status and lung transplantation were ascertained through medical record review and cross-matched with a United States Social Security Death Index (USSDI) search. Since all baseline characteristics were considered for cluster development, we did not adjust for specific variables in the model. P values of < 0.05 were considered statistically significant. All analyses were performed using R, version 4.0.3 (RStudio, Inc., Boston, MA, USA), with the ConsensusClusterPlus package (version 1.46.0) for consensus clustering analysis and the missForest package for imputation of missing data [9].


Of 779 patients with suspected f-HP evaluated between January 2005 and December 2020, 448 were compatible with f-HP based on the 2020 ATS/JRS/ALAT guideline. Seventy-one patients were excluded for diagnostic confidence < 50% and forty-one for missing baseline PFTs. A total of 336 f-HP patients were included in the final analysis (Fig. 1) with a mean age of 65.3 ± 10.9 years. Approximately half were male and had a history of smoking. Definite diagnosis of f-HP was confirmed in 133 (49.6%) with causative antigen exposures identified in 60% of the total cohort.

Fig. 1 Patient selection

Consensus clustering analysis was applied to the final set of f-HP patients meeting inclusion criteria. A CDF plot provides the consensus distributions for each value of k (Fig. 2A). A delta area plot shows the relative change in the area under the CDF curve (Fig. 2B). The greatest changes in area were identified between k = 3 and k = 5. As shown on the CM heat map (Fig. 2C, supplementary Figs. 1–9), the machine learning (ML) algorithm identified the three-cluster solution (k = 3) with distinct borders, demonstrating high cluster stability across repeated iterations. The mean cluster consensus score was highest for three clusters (mean consensus score of 0.90) (Fig. 3A), with a favorably low PAC demonstrated for k = 3 (Fig. 3B). Overall, consensus clustering analysis identified three clinically distinct phenotypes.

Of the 336 f-HP patients, 158 (47.0%), 46 (13.7%), and 132 (39.3%) were classified into clusters 1, 2, and 3, respectively. Baseline characteristics of the three clusters are presented in Table 1. Variables differing among the three included age, BMI, diagnostic confidence, causative antigen identification, baseline PFT findings, and OSA as a comorbidity. The standardized mean difference plot was used to identify key clinical characteristics of each cluster, as presented in Fig. 4.
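The cohort flow and cluster proportions reported above can be checked with simple arithmetic. All counts below are taken from the text; only the variable names are introduced here.

```python
compatible_with_fhp = 448
excluded_low_confidence = 71
excluded_missing_pft = 41
included = compatible_with_fhp - excluded_low_confidence - excluded_missing_pft

cluster_sizes = {1: 158, 2: 46, 3: 132}
cluster_pct = {c: round(100 * n / included, 1) for c, n in cluster_sizes.items()}

print(included, cluster_pct)
```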

Fig. 2 (A) CDF plot displaying consensus distributions for each k. Each color represents a specific number of clusters. (B) Delta area plot (x-axis (k) signifies the number of clusters). The plot demonstrates relative changes in area beneath the CDF curve with increasing numbers of clusters. (C) Consensus matrix heat map depicting consensus values on a white to blue color scale of each cluster

Fig. 3 (A) The bar plot displays the mean consensus score for different numbers of clusters, where k ranges from two to ten. Each colored bar within a specific number represents an individual cluster from separate clustering simulations. This iterative approach was adopted to evaluate stability and consistency of the clustering results. (B) The PAC values assess ambiguously clustered pairs

Fig. 4 The standardized mean difference plot identifying clinical characteristics of each cluster

Patients in cluster 1 were more likely to have preserved pulmonary function, defined by only slightly decreased mean FVC (78.2% predicted) and TLC (78.4% predicted), despite being the oldest of the three clusters at presentation (mean age 68 ± 9.7 years). Mean DLCO was 55.8% of predicted, comparable to cluster 2 but significantly higher than cluster 3. Cluster 2 had lower mean age (60.9 years) and BMI (27.5 kg/m²) with more causative antigen identification (84.8%), particularly to avian and hot tub exposure. PFT findings were also more obstructive with air trapping, lower mean FEV1/FVC ratio (0.69), FEV1 (57.8% predicted), and FEF25–75% (44.5% predicted). Higher mean RV (139.9% predicted), TLC (90.8% predicted), and RV/TLC (151.8% predicted) were also found compared to the other two clusters. Cluster 3 had more severe restriction, with lower mean FVC (51.1% predicted), RV (68.6% predicted), TLC (59.2% predicted), and DLCO (40.5% predicted). Characteristics of the entire cohort and each cluster are presented in Fig. 4 and Table 1.

Treatment details are presented in Table 1. With respect to therapeutic interventions, patients in Cluster 2 were more likely not to receive treatment of any kind (26%), including corticosteroid and steroid-sparing agents. Significantly higher antigen avoidance was also observed in this cluster (50%).

Of those in cluster 1, 53 (33.5%) died and 12 (7.6%) underwent lung transplantation. In cluster 2, 10 (21.7%) died and 2 (4.3%) underwent lung transplantation. In cluster 3, 60 (45.5%) died and 11 (8.3%) underwent lung transplantation. When compared to cluster 1, risk of lung transplantation or death was significantly lower for cluster 2 (hazard ratio (HR) 0.42; 95% CI, 0.21–0.82; P = 0.01) and significantly higher for cluster 3 (HR 1.76; 95% CI, 1.24–2.48; P = 0.001). Kaplan-Meier survival curves for the three clusters are presented in Fig. 5.
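As a consistency check on the reported hazard ratios, the standard error of log(HR) can be recovered from the 95% CI and used to recompute an approximate two-sided P value. This sketch assumes the intervals are Wald intervals on the log scale, which the text does not state. The function name is hypothetical, and the rounded CI limits introduce small errors.

```python
from math import erfc, log, sqrt


def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE of log(HR), the z statistic, and a two-sided normal P value."""
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(hr) / se
    p = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return se, z, p


# Values reported in the text (reference: cluster 1)
se2, z2, p2 = wald_from_ci(0.42, 0.21, 0.82)  # cluster 2
se3, z3, p3 = wald_from_ci(1.76, 1.24, 2.48)  # cluster 3
print(round(p2, 4), round(p3, 4))
```

Under this assumption both recomputed P values fall close to the reported 0.01 and 0.001.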

Fig. 5 Kaplan-Meier survival curves comparing transplant-free survival among each cluster
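For readers unfamiliar with the estimator behind such curves, a minimal Kaplan-Meier sketch follows. The follow-up times and event indicators are hypothetical and unrelated to the study data; the function name is also illustrative.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    events: 1 = death or transplant, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = sum(1 for tt, _ in data if tt == t)
        n_events = sum(1 for tt, e in data if tt == t and e == 1)
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_at_t  # events and censored subjects both leave the risk set
        i += n_at_t
    return curve


# Hypothetical follow-up (years) and outcome indicators
curve = kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 1, 0])
print(curve)
```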


Phenotypic characterization resulting in prognostic or differential outcomes has not been previously described in patients with f-HP. Individual clinical parameters have been reported as relevant to predicting outcome (exposure history, lung function, and radiologic findings), though such findings may be heterogeneous or present variably among diverse sets of patients [4, 5, 13]. A cluster algorithm approach may identify groups of similar patients using a wide-ranging set of clinical characteristics [6]. A primary advantage of cluster analysis is the potential discovery of new or unexpected disease patterns which may not be intuitive or may be difficult to characterize due to multifaceted or overlapping presentations. In this study, an unsupervised ML consensus clustering algorithm identified three distinct clusters of f-HP patients based on presenting findings. Key features of each cluster were highlighted by pulmonary function and causative antigen exposure history, despite the inclusion of multiple clinical variables and comorbidities in the analysis. Importantly, the three clusters translated to separate transplant-free survival in the setting of typical treatment or antigen avoidance strategies.

Cluster 1 accounted for the largest share of the f-HP patients included in our cohort (47.0%). Patients in this group had mild restrictive pulmonary physiology with slightly decreased mean FVC and DLCO, despite older age at presentation. Mortality or transplant outcomes were observed on average after ten or more years of follow-up. Higher pulmonary function may represent earlier diagnosis, though the subsequently longer survival seen here may represent slower progression or better response to subsequent antigen avoidance or treatment. Similarly, Cluster 3, characterized by more severe restrictive physiology, may also represent more advanced or late-stage disease despite younger age at presentation, as f-HP may present at any age. Baseline FVC and DLCO have been previously described as outcome predictors in f-HP [14, 15]. Notably, UIP HRCT pattern (6 vs. 8%) and honeycombing (18 vs. 21%) were found with similar frequency between the two groups.

Our study found patients in Cluster 2 were uniquely characterized by obstructive physiology on PFTs. The impact of obstruction on outcome or its relation to other clinical characteristics remains unclear in patients with f-HP. Obstruction may be seen in HP as an acute or earlier manifestation of small airways involvement. Mosaic attenuation or expiratory air trapping, typical HRCT findings in f-HP, may also represent small airway involvement with physiologic obstruction [16]. Zuniga et al. found patients with f-HP had improvement in likely small airway-related obstruction, as characterized by a decrease in the phase 3 slope of ultrasonic pneumography, after immunosuppressive treatment [17]. Obstructive physiology might represent active and potentially reversible small airways inflammation or injury responsive to therapy, and perhaps better survival.

Patients in cluster 2 were also younger, had lower BMI, and higher rates of identifiable causative antigen, particularly to avian or hot tub exposure. Younger age, identifiable causative exposure, and antigen avoidance have been previously reported as associated with improved mortality [4, 13, 18]. Our study confirms findings from a previous report demonstrating better survival in patients with history of avian antigen exposure [13]. Since exposure to avian antigens or hot tubs is often easier to identify and avoid, such patients might also have better outcomes. Additionally, compared to clusters 1 and 3, mosaic attenuation occurred more frequently. Honeycombing was also found in 11%, with none having typical or probable UIP HRCT patterns.

As discussed, cluster analysis not only identifies distinctive presenting characteristics inherent to a particular group but may also derive guidance for tailoring appropriate treatment according to associated disease progression or survival outcome. Our study found that patients in Cluster 2 had more favorable outcomes with nearly 30% abstaining from any medical treatment. In contrast, patients in Cluster 3 experienced worse survival despite nearly all receiving initial corticosteroids and half going on to long-term steroid-sparing agents. The earlier use of antifibrotics when meeting criteria for progressive pulmonary fibrosis (as suggested for Cluster 3) may be an appropriate treatment strategy.

Our study has several limitations. First, selection bias is possible with the use of a single tertiary referral center and patients evaluated over a decade or more of clinical experience. Despite the systematic use of recent international consensus criteria to align diagnostic uncertainty, historical practices and their evolution over time may limit the availability of all clinical parameters. Original multidisciplinary team discussions were not documented for all patients; however, extensive clinical, HRCT, and pathological reports were available for defining diagnostic confidence levels according to the 2020 ATS/JRS/ALAT guideline. Excluded patients who did not have baseline PFTs (N = 41) were also younger with a higher proportion of ‘definite’ HP diagnostic confidence levels (supplement Table S1), which might have affected the current analyses had they been included. While a broad range of clinical variables were included in the cluster analysis, there may still be unaccounted or unknown factors that could change current phenotypic characterizations, including timing of symptom onset. Finally, an all-cause mortality endpoint may not entirely represent the direct impact of disease progression from f-HP but may also reflect contributions from other unrelated comorbidities or complications. We attempted to account for this with the inclusion of selected comorbidities in the clustering model, of which none appeared to be distinguishing.


We identified three distinct phenotypes of f-HP using an unsupervised machine learning consensus clustering approach. These three clusters, as characterized by pulmonary function testing (mild vs. more severe restriction vs. obstruction) and identifiable antigen exposure history, translated to unique transplant-free survival outcomes.

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.


  1. Raghu G, Remy-Jardin M, Ryerson CJ, Myers JL, Kreuter M, Vasakova M, Bargagli E, Chung JH, Collins BF, Bendstrup E, et al. Diagnosis of hypersensitivity pneumonitis in adults. An Official ATS/JRS/ALAT Clinical Practice Guideline. Am J Respir Crit Care Med. 2020;202:e36–e69.

  2. Vasakova M, Morell F, Walsh S, Leslie K, Raghu G. Hypersensitivity pneumonitis: perspectives in diagnosis and management. Am J Respir Crit Care Med. 2017;196:680–9.

  3. Fernández Pérez ER, Kong AM, Raimundo K, Koelsch TL, Kulkarni R, Cole AL. Epidemiology of hypersensitivity pneumonitis among an insured population in the United States: a claims-based cohort analysis. Ann Am Thorac Soc. 2018;15:460–9.

  4. Fernandez Perez ER, Swigris JJ, Forssen AV, Tourin O, Solomon JJ, Huie TJ, Olson AL, Brown KK. Identifying an inciting antigen is associated with improved survival in patients with chronic hypersensitivity pneumonitis. Chest. 2013;144:1644–51.

  5. Salisbury ML, Gu T, Murray S, Gross BH, Chughtai A, Sayyouh M, Kazerooni EA, Myers JL, Lagstein A, Konopka KE, et al. Hypersensitivity pneumonitis: radiologic phenotypes are associated with distinct survival time and pulmonary function trajectory. Chest. 2019;155:699–711.

  6. Rodriguez MZ, Comin CH, Casanova D, Bruno OM, Amancio DR, Costa LDF, Rodrigues FA. Clustering algorithms: a comparative approach. PLoS ONE. 2019;14:e0210236.

  7. Prior TS, Walscher J, Gross B, Bendstrup E, Kreuter M. Clusters of comorbidities in fibrotic hypersensitivity pneumonitis. Respir Res. 2022;23:368.

  8. Wong AW, Lee TY, Johannson KA, Assayag D, Morisset J, Fell CD, Fisher JH, Shapera S, Gershon AS, Cox G, et al. A cluster-based analysis evaluating the impact of comorbidities in fibrotic interstitial lung disease. Respir Res. 2020;21:322.

  9. Stekhoven DJ, Buhlmann P. MissForest–non-parametric missing value imputation for mixed-type data. Bioinformatics. 2012;28:112–8.

  10. Monti S, Tamayo P, Mesirov J, Golub T. Consensus clustering: a resampling-based method for class discovery and visualization of gene expression microarray data. Mach Learn. 2003;52:91–118.

  11. Wilkerson MD, Hayes DN. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics. 2010;26:1572–3.

  12. Șenbabaoğlu Y, Michailidis G, Li JZ. Critical limitations of consensus clustering in class discovery. Sci Rep. 2014;4:6207.

  13. De Sadeleer LJ, Hermans F, De Dycker E, Yserbyt J, Verschakelen JA, Verbeken EK, Verleden GM, Wuyts WA. Effects of corticosteroid treatment and antigen avoidance in a large hypersensitivity pneumonitis cohort: a single-centre cohort study. J Clin Med. 2018;8.

  14. Moua T, Petnak T, Charokopos A, Baqir M, Ryu JH. Challenges in the diagnosis and management of fibrotic hypersensitivity pneumonitis: a practical review of current approaches. J Clin Med. 2022;11.

  15. Ojanguren I, Morell F, Ramon MA, Villar A, Romero C, Cruz MJ, Munoz X. Long-term outcomes in chronic hypersensitivity pneumonitis. Allergy. 2019;74:944–52.

  16. Dias OM, Baldi BG, Chate RC, Ribeiro de Carvalho CR, Dellaca RL, Milesi I, Pereira de Albuquerque AL. Forced oscillation technique and small airway involvement in chronic hypersensitivity pneumonitis. Arch Bronconeumol (Engl Ed). 2019;55:519–25.

  17. Guerrero Zuniga S, Sanchez Hernandez J, Mateos Toledo H, Mejia Avila M, Gochicoa-Rangel L, Miguel Reyes JL, Selman M, Torre-Bouscoulet L. Small airway dysfunction in chronic hypersensitivity pneumonitis. Respirology. 2017;22:1637–42.

  18. Gimenez A, Storrer K, Kuranishi L, Soares MR, Ferreira RG, Pereira CAC. Change in FVC and survival in chronic fibrotic hypersensitivity pneumonitis. Thorax. 2018;73:391–2.





Author information

TP, TM, WC, and CT contributed equally to study design and conceptualization, data procurement, and analysis. TP, TM, TS, and ST contributed to manuscript writing. TP and TM are responsible for the integrity and completeness of this work.

Corresponding author

Correspondence to Teng Moua.

Ethics declarations

Ethics approval and consent to participate

Mayo Clinic Institutional Review Board (IRB) approval was required before study initiation (approval No. 20–000211).

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Petnak, T., Cheungpasitporn, W., Thongprayoon, C. et al. Phenotypic subtypes of fibrotic hypersensitivity pneumonitis identified by machine learning consensus clustering analysis. Respir Res 25, 41 (2024).
