
Highly diverse sputum microbiota correlates with the disease severity in patients with community-acquired pneumonia: a longitudinal cohort study



Community-acquired pneumonia (CAP) is a common and serious condition that can be caused by a variety of pathogens. However, much remains unknown about how these pathogens interact with the lower respiratory commensals, and whether any correlation exists between the dysbiosis of the lower respiratory microbiota and disease severity and prognosis.


We conducted a retrospective cohort study to investigate the composition and dynamics of sputum microbiota in patients diagnosed with CAP. In total, 917 sputum specimens were collected consecutively from 350 CAP inpatients enrolled in six hospitals following admission. The V3-V4 region of the 16S rRNA gene was then sequenced.


The sputum microbiota in 71% of the samples was predominantly composed of respiratory commensals. Conversely, 15% of the samples demonstrated dominance by five opportunistic pathogens. Additionally, 5% of the samples exhibited sterility, resembling the composition of negative controls. Compared to non-severe CAP patients, severe cases exhibited a more disrupted sputum microbiota, characterized by the highly dominant presence of potential pathogens, greater deviation from a healthy state, more significant alterations during hospitalization, and sparser bacterial interactions. The sputum microbiota on admission demonstrated a moderate prediction of disease severity (AUC = 0.74). Furthermore, different pathogenic infections were associated with specific microbiota alterations. Acinetobacter and Pseudomonas were more abundant in influenza A infections, and Acinetobacter was also enriched in Klebsiella pneumoniae infections.


Collectively, our study demonstrated that pneumonia may not consistently correlate with severe dysbiosis of the respiratory microbiota. Instead, the degree of microbiota dysbiosis was correlated with disease severity in CAP patients.


Community-acquired pneumonia (CAP) is an acute respiratory infection acquired outside the hospital, affecting alveoli and distal airways, with variable symptoms including cough, fever, dyspnea, and expectoration [1]. The incidence of lower respiratory tract infection (LRI), which includes CAP, was 5,837 cases and 6,832 cases per 100,000 population among females and males, respectively [2]. It resulted in high morbidity and mortality rates in all age groups, especially in children and the elderly [2]. LRI remained the fourth leading cause of global years of life lost in 2019 before the COVID-19 pandemic [3].

Recent culture-independent studies revealed that the respiratory tract was not sterile in healthy individuals [4], and the lower respiratory tract microbiota contributed to the ecological and immunological homeostasis of the lung, influencing lung health and susceptibility to infections [5]. Although pathogen invasion is considered the cause of CAP, the causative agents are detected in fewer than 50% of CAP patients [4]. Studies have identified significant differences in the respiratory microbiota between CAP patients and healthy individuals, with the former being less diverse and enriched with pathogenic microbes such as Pseudomonas, Staphylococcus, and Klebsiella [6,7,8,9]. Additionally, the respiratory microbiota may influence pneumonia susceptibility by impeding pathogen colonization and modulating immunity [10, 11]. However, previous studies primarily focused on high-risk populations, such as human immunodeficiency virus (HIV) patients, lung transplant recipients, and children, and often had small sample sizes [6,7,8]. The association between respiratory microbiota and CAP in immunocompetent adults remains unclear. The interpretation is further complicated by diverse pathogens, the use of antibiotics, intubation, and corticosteroid therapies in CAP patients. Therefore, a comprehensive microbiota study in the general population, especially those untreated, is needed.

The respiratory microbiota is heterogeneous due to various host and environmental factors, including genetic background, mode of birth, feeding type, and inhaled pollutants [12,13,14]. Thus, clarifying the role of a specific variable in shaping the respiratory microbiota is challenging in cross-sectional studies. In contrast, longitudinal studies can pinpoint particular microbiota changes associated with a specific condition by controlling other covariates. Although longitudinal studies have been conducted on lung transplantation, chronic obstructive pulmonary disease (COPD), cystic fibrosis, COVID-19, and ventilator-associated pneumonia [15,16,17,18,19], studies on the lower respiratory microbiota in CAP patients are limited. Studying microbiota changes during the disease process will provide insights into the role of the respiratory microbiota in disease development.

In this study, we collected time-series sputum samples from CAP patients starting from the first day after admission, prior to therapy administration. We identified a correlation between the composition and dynamics of the sputum microbiota and disease severity, revealing distinct microbiota compositions in patients with different pathogens. This suggests that the dysbiosis of the sputum microbiota could potentially serve as a valuable diagnostic and prognostic marker for pneumonia. Furthermore, it presents a possible target for intervention in the management of the condition.


Overview of the samples and sequencing data

Longitudinal sputum samples (1,065) were collected from 367 inpatients diagnosed with CAP in six hospitals across representative geographical locations in China. Following quality filtering, 917 samples from 350 patients and 25 negative controls (NCs) were used for subsequent analysis (Fig. 1A-C). The composition of sputum microbiota of CAP patients was notably different from that of NCs (PERMANOVA, R2 = 0.2, p = 0.001, Fig. S1A), and the five most abundant taxa in NCs (Sphingomonas, Blastomonas, Methylobacterium, Bosea, and Propionibacterium, Fig. S1B) comprised 53.1% of all NC sequences but only 2.7% of all sequences in CAP samples, indicating minimal background contamination.

The median time from symptom onset to admission was six days (IQR 3–7). Forty-one (12.0%) patients had chronic pulmonary diseases, including COPD, asthma, and bronchiectasis. Before admission, 66 (19.5%) patients took antibiotics within five days, and 15 (5.5%) patients used immunosuppressants. Fifty-five (16.3%) patients were diagnosed as severe cases, and seven of them died. Notably, five clinical severity indicators, including the use of invasive mechanical ventilation, CURB65 scores, pneumonia severity index (PSI) scores, duration of oxygen supplementation, and length of hospital stay, were all significantly higher in severe cases than in non-severe cases (Fisher’s exact test or Wilcoxon signed-rank test, p < 0.05). More demographic and clinical information is provided in Table 1 and Table S1. Meanwhile, 876 sputum microbiotas from healthy Chinese individuals (with no acute or chronic respiratory diseases) from three previous studies were used as the healthy controls (HCs) in the study (Table S2 and S3) [20,21,22].

Table 1 Metadata of study subjects and their correlations with the sputum microbiota on admission
Fig. 1

Study design and sputum microbiota composition. (A) Geographic distribution of the samples. n: sample size. (B) Sampling strategy. d: days after admission. (C) Summary of the collected samples. (D) Abundance of bacteria in CAP patients and negative controls. The top 15 bacteria with the highest average relative abundance in CAP patients are shown. Bacteria that are more enriched in CAP patients than in all three HCs are labeled in red, while those enriched in HCs are labeled in blue

Sputum microbiota composition in CAP patients

Six commensal microbes that are frequently observed in the respiratory tract, including Streptococcus, Veillonella, Neisseria, Prevotella, Rothia, and Haemophilus, showed the highest relative abundance and accounted for 51.2% of CAP microbial reads, 38.0% of HC microbial reads, and 1.4% in NCs (Fig. 1D). The sputum microbiota diversity and composition in CAP patients were significantly different from HCs (PERMANOVA, mean R2 = 0.13, p < 0.001; Fig. S1A). Possible pathogens, including Pseudomonas, Enterobacteriaceae, Sphingomonas, and Stenotrophomonas, were significantly enriched in CAP samples compared to all three HC populations (Fig. S1C).

The major component of sputum microbiotas in CAP patients showed significant heterogeneity among different individuals (Fig. 1D). Employing clustering algorithms on the microbiota data revealed the presence of nine distinct clusters (CSs) (Fig. 2A, Fig. S2A). The robustness of these clusters was confirmed by bootstrap analysis (Fig. S2B, mean Rand index = 0.85, see Supplementary methods). These CSs could be further classified into three microbiota types: CS2, CS3, and CS4 (CS2-4), which were found in 71.1% of CAP patients, exhibited higher alpha diversity than other clusters, except for CS6 (Fig. 2B). CS2-4 were dominated by commensal bacteria (Fig. 2A) and showed higher similarity to healthy controls (Fig. 2C). In contrast, CS1, CS5, and CS7-9 (CS1,5,7,8,9) were dominated by possible pathogens. They had lower alpha diversity, higher dominant bacteria abundance, and were more distinct from HCs compared to CS2-4 (Fig. 2B-C, Fig. S2C). Microbiotas in CS6 exhibited the highest similarity to NCs and were more prevalent in specific individuals than randomly distributed (Fig. 2D, Fig. S2D-F), suggesting that the sputum samples of CS6 were either relatively sterile or challenging to collect. Additionally, CS6 did not appear to be detected more frequently in the later period of hospitalization, suggesting no association with post-admission treatment (Fig. S2G). Notably, the severity rate (incidence of severe condition) in CS2-4 patients was 12.9%, similar to CS6 (4.7%, Fisher’s exact test, p = 0.213), but significantly lower than CS1,5,7,8,9 (36.4%, Fisher’s exact test, p < 0.001).
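To make the robustness measure above concrete: the Rand index is the fraction of sample pairs on which two clusterings agree, meaning both place the pair in the same cluster or both separate it. The following Python sketch is illustrative only. The function name and the toy labels are hypothetical, and the bootstrap procedure itself is described in the Supplementary methods and is not reproduced here.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of sample pairs treated consistently by two clusterings."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same samples")
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        # A pair counts as agreement when both clusterings group it
        # together or both keep it apart.
        if same_a == same_b:
            agree += 1
        total += 1
    return agree / total


# Toy example (hypothetical labels): cluster names differ, partition is identical.
original = ["CS2", "CS2", "CS5", "CS5"]
resampled = ["a", "a", "b", "b"]
identical_partition = rand_index(original, resampled)
```

Because the index depends only on co-membership, relabeling the clusters does not change it, which is why it suits comparisons between a bootstrap clustering and the original one.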

Fig. 2

The composition of sputum microbiota clusters in CAP patients. (A) The compositions of bacteria in different clusters. Bacteria with average relative abundances greater than 5% in at least one cluster are shown. The sample numbers in each cluster and NCs were labeled above the figure. (B) Shannon index of each cluster and HCs. (C) JSD distance between different CAP clusters and healthy individuals. The microbiota composition of the three HC groups was averaged and used as the HC to calculate the distance. (D) JSD distance between different CAP clusters and NCs. In B and C, statistical significance was determined by comparing each cluster with all the HCs, and the color of the plot in B, C, and D denotes the proportion of severe cases in each CAP cluster. * p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001
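The distance to healthy controls described in panel C can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it takes the "JSD distance" to be the square root of the Jensen-Shannon divergence with base-2 logarithms, and it averages the reference profiles taxon by taxon. All function names and the example abundances are hypothetical.

```python
import math


def _normalize(profile):
    """Scale a vector of abundances so that it sums to 1."""
    total = float(sum(profile))
    if total <= 0:
        raise ValueError("profile must have a positive total abundance")
    return [x / total for x in profile]


def _kl(p, q):
    """Kullback-Leibler divergence in bits; zero-abundance terms contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd_distance(profile_a, profile_b):
    """Square root of the Jensen-Shannon divergence (assumed definition)."""
    p = _normalize(profile_a)
    q = _normalize(profile_b)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    divergence = 0.5 * _kl(p, m) + 0.5 * _kl(q, m)
    return math.sqrt(max(divergence, 0.0))


def average_profile(profiles):
    """Taxon-wise mean of several normalized profiles."""
    normalized = [_normalize(p) for p in profiles]
    return [sum(col) / len(normalized) for col in zip(*normalized)]


# Hypothetical relative abundances over the same three taxa.
healthy_groups = [[0.5, 0.3, 0.2], [0.6, 0.3, 0.1], [0.4, 0.3, 0.3]]
healthy_reference = average_profile(healthy_groups)
patient_sample = [0.1, 0.1, 0.8]
distance_to_hc = jsd_distance(patient_sample, healthy_reference)
```

With base-2 logarithms the distance is bounded between 0 (identical profiles) and 1 (no shared taxa), which makes values comparable across clusters.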

Association between sputum microbiota and disease severity

We then investigated the association between clinical and demographic features and the sputum microbiota. To minimize the impact of antibiotic use and other medical interventions after admission, only 238 samples collected on the first day after admission were used for the subsequent analyses. We found that disease severity (diagnosed by the clinician, see Methods), as well as five clinical indicators that correlated with disease severity, including CURB65 scores, PSI scores, duration of oxygen supplementation, length of hospital stay, and clinical outcome, were all significantly correlated with the sputum microbiota, after controlling for the possible confounders (Table 1, confounders: variables 1–10). In individual geographic sites, the correlation with disease severity remained statistically significant in Wuhan, with the largest sample size (n = 142, PERMANOVA, R2 = 0.021, p < 0.05), suggesting that the correlation was not influenced by the differences in patient enrollment across various geographic locations. Notably, the use of antibiotics and immunosuppressive drugs before admission, as well as days from onset, showed no significant impact on the microbiota composition (Table 1). This might be attributed to the limited sample size in this study. Nevertheless, variables 1–10 in Table 1 were all included as covariates whenever applicable in subsequent multivariate analyses.

First, we found that the alpha diversity of sputum microbiotas in non-severe patients was less deviated from the three healthy cohorts compared to severe patients (Fig. 3A). Moreover, the microbiotas in 53.8% of severe patients were dominated by bacteria with abundances greater than 50%, whereas the fraction was 22.0% and 0% in non-severe patients and healthy individuals, respectively (Fig. 3B). Additionally, dominant bacteria with abundances greater than 50% comprised more possible pathogens, especially in severe patients (Fig. 3C).
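The two quantities used in this paragraph, alpha diversity and the share of samples dominated by a single taxon, can be illustrated with a short Python sketch. It assumes the Shannon index with natural logarithms and a strict "greater than 50%" dominance threshold. The function names and the example counts are hypothetical and are not taken from the study data.

```python
import math


def shannon_index(abundances):
    """Shannon diversity H = -sum(p * ln p) over taxa with non-zero abundance."""
    total = float(sum(abundances))
    if total <= 0:
        raise ValueError("sample has no reads")
    props = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in props)


def dominant_taxon(profile):
    """Return (taxon name, relative abundance) of the most abundant taxon."""
    total = float(sum(profile.values()))
    name, count = max(profile.items(), key=lambda item: item[1])
    return name, count / total


def fraction_dominated(profiles, threshold=0.5):
    """Share of samples whose top taxon exceeds the abundance threshold."""
    dominated = sum(1 for p in profiles if dominant_taxon(p)[1] > threshold)
    return dominated / len(profiles)


# Hypothetical read counts for two samples.
samples = [
    {"Enterobacteriaceae": 80, "Streptococcus": 20},
    {"Streptococcus": 40, "Veillonella": 35, "Neisseria": 25},
]
share = fraction_dominated(samples)
```

In this toy input one of the two samples is dominated, and that sample also has the lower Shannon index, mirroring the inverse relationship between dominance and diversity described above.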

The composition of sputum microbiota differed considerably between severe and non-severe patients (multivariate PERMANOVA R2 = 0.02, p < 0.05; Fig. 3D), which were both distinct from healthy individuals (PERMANOVA, p < 0.001; Fig. S3A). Microbiotas in severe patients were more disrupted relative to healthy individuals than those in non-severe patients (Fig. 3E). LEfSe analysis revealed increased abundance of possible pathogens, including Enterobacteriaceae, Acinetobacter, and Enterococcus in severe cases, while commensal bacteria, including Haemophilus, Neisseria, and Prevotella were more abundant in non-severe cases (p < 0.05, Fig. 3F). MaAsLin2 analysis identified enrichment of Enterobacteriaceae in severe cases, and its abundance was also positively correlated with the duration of oxygen supplementation, while adjusting for covariates (Fig. S3B). Additionally, a classifier utilizing the L1 regularized logistic regression model could distinguish the severe cases from non-severe cases using microbiota with moderate accuracy (AUC = 0.74; Fig. 3G). Key features selected for identifying severe cases included high abundances of Enterobacteriaceae and Corynebacterium, along with a low abundance of Neisseria. Furthermore, the analysis of patients from individual cities confirmed the enrichment of Enterobacteriaceae in severe patients (in Wuhan), suggesting that the identified signature was not an artifact due to variations in the patients enrolled from different cities.
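For readers less familiar with the AUC reported above, it can be read as the probability that a randomly chosen severe case receives a higher classifier score than a randomly chosen non-severe case. The Python sketch below computes the AUC from that pairwise definition. It is not the authors' classifier: the function name, labels, and scores are hypothetical and serve only to illustrate the metric.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for severe, 0 for non-severe; ties in score count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities of severe disease.
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.1]
example_auc = roc_auc(example_labels, example_scores)  # 3 of 4 pairs ranked correctly
```

Under this reading, an AUC of 0.74 means roughly three in four severe versus non-severe pairs are ordered correctly, while 0.5 corresponds to random ranking.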

The functional potential of the sputum microbiota was predicted using PICRUSt analysis [23]. Five of the top 10 pathways enriched in the severe cases (MaAsLin2 analysis) were related to menaquinol biosynthesis (Fig. S3C and D), with all five pathways contributed by Enterobacteriaceae. Menaquinones are involved in the post-translational modifications of proteins needed for blood coagulation [24], and their dysfunction has been proposed as a risk factor for the severity of CAP [25, 26]. Meanwhile, four pathways involving the fermentation of butanoate, primarily contributed by Porphyromonas and Fusobacteria, were enriched in non-severe cases (Fig. S3E). Butanoate has been shown to enhance T cell proliferation and activation while suppressing inflammatory reactions [27, 28].

Fig. 3

Difference in the sputum microbiota between CAP patients with varying degrees of severity and healthy individuals. (A) Shannon index of the microbiota of CAP patients on admission and healthy individuals. (B) Distribution of the abundance of the predominant bacterium in CAP patients and HCs. The y-axis indicates the proportion of patients with a dominating bacterium whose abundance is greater than that indicated on the x-axis. The number of patients with a dominating bacterium whose abundance is greater than 0, 25%, 50%, 75%, and 100% is shown below the x-axis. The p-value was calculated by log-rank test. (C) Proportion of dominant bacteria in CAP patients and HCs. Bacterial genera and families containing at least one known (opportunistic) pathogen are highlighted with red boxes. The numbers in brackets on the x-axis indicate the number of samples. A list of pathogenic bacteria is provided in Table S4. Samples with dominating bacterium abundance higher than 50% and lower than 50% are shown separately. (D) PCoA plot of samples from severe and non-severe patients based on the JSD distance. R2 was calculated by PERMANOVA analysis. (E) JSD distance to healthy individuals of severe and non-severe CAP patients. The microbiota composition of the three HC groups was averaged and used as the HC to calculate the distance. (F) Bacteria correlated with the disease severity identified by LEfSe (LDA score > 4, p < 0.05). (G) ROC curve for the disease severity classifier based on the L1 regularized logistic regression model. * p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001

Dynamics of the sputum microbiota and its association with the disease severity

We further investigated how microbiota dynamics varied between non-severe cases and severe cases. First, the alpha diversity of microbiota in non-severe cases was significantly higher than that in severe cases at the first three time points (Fig. 4A), with no significant difference observed between different time points within the same group. Second, severe cases showed a larger longitudinal change in the microbiota composition (Fig. 4B), becoming more deviated from the initial state during hospitalization (Fig. 4C). Third, neither the severe nor the non-severe patients’ sputum microbiota altered toward a healthy state during hospitalization (Fig. S4A).

Then, we explored the CS transition pattern between Day 1 and Day 5, which encompassed the largest number of sample pairs (33 severe cases and 172 non-severe cases). First, cluster switching occurred more frequently in severe cases (66.7% vs. 43.6%, Fisher’s exact test, p < 0.05, Fig. 4D). Furthermore, transitions between different CSs were likely non-random, as all three CS8 samples on Day 1 switched to CS5 on Day 5 in severe cases, whereas other CSs rarely transitioned to CS5 (100% vs. 10.5%, Fisher’s exact test, p < 0.01, Fig. 4D). Specifically, all three of those CS8 samples were dominated by Enterobacteriaceae (abundance > 62.8%) on Day 1, with abundance decreasing to less than 28.1% on Day 5, while Acinetobacter increased from less than 1.2% to more than 54% (NJ17037, NJ17043, and NJ17054 in Fig. 4E). In addition, all six severe patients with CS5 microbiota on Day 5 received invasive mechanical ventilation during hospitalization, suggesting that the expansion of Acinetobacter might be associated with secondary infection following the use of invasive mechanical ventilation. However, not all intubated patients transitioned to CS5 (6 out of 11; Fig. 4F), although the probability was much higher than that in non-intubated patients (54.6% vs. 2.1%, Fisher’s exact test, p < 0.01).
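The cluster-switching comparison can be checked with a small two-sided Fisher's exact test. The Python sketch below is an illustration under an explicit assumption: the counts 22 of 33 (severe) and 75 of 172 (non-severe) are back-calculated from the reported percentages and are not stated in the text. The function name is hypothetical, and the two-sided p-value sums all tables that are no more probable than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denominator = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = prob(a)
    # Sum every table at least as extreme (no more probable) than the observed one.
    total = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Assumed counts reconstructed from 66.7% of 33 and 43.6% of 172.
severe_switched, severe_stable = 22, 11
nonsevere_switched, nonsevere_stable = 75, 97
p_value = fisher_exact_two_sided(
    severe_switched, severe_stable, nonsevere_switched, nonsevere_stable
)
```

Under these assumed counts the p-value falls below 0.05, consistent with the reported significance level. An exact test is a natural choice here because the severe group is small.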

To explore the association between the dynamics of microbial interaction and disease severity in CAP, correlation networks were constructed for samples collected at different time points and in different groups. We found that the interactions between bacteria were remarkably sparser (with a small number of edges and degrees in the network) in severe patients than in non-severe patients at all time points (Fig. 4G, Fig. S4B). Meanwhile, we noted that the network contained more potential pathogens, such as Enterobacteriaceae, in severe patients compared to non-severe patients and HCs (Fig. S4B and C), suggesting a possible dysbiotic state of the sputum microbiota in severe patients. Furthermore, the number of network connections in the severe group decreased markedly but remained unchanged in the non-severe group, indicating that the sputum microbiota in severe patients may become more disordered during hospitalization.

Fig. 4

Dynamics of the sputum microbiota and its association with disease severity. (A) Shannon index of sputum microbiota in severe cases and non-severe cases at different time points after admission. (B) Differences in JSD distance between two consecutive samples from severe and non-severe cases. (C) JSD distance between samples on admission and samples collected at different sampling time points. (D) The transitional Sankey diagram of different microbiota clusters from Day 1 to Day 5 in severe and non-severe CAP cases. Outliers are samples that could not be assigned to any of the nine clusters. (E) The microbiota composition of six severe patients whose microbiota belonged to CS5 on Day 5. (F) The transitional Sankey diagram of microbiota clusters from Day 1 to Day 5 in 11 patients who underwent invasive mechanical ventilation. (G) Giant component of concurrent networks constructed by SpiecEasi in severe and non-severe CAP cases at different time points. Each node denotes a bacterial taxon, and the size of a node represents the mean abundance of that taxon. Black lines represent positive correlations between microbes, while green lines represent negative correlations. The thickness of the lines denotes the magnitude of the correlation. The numbers of edges (E) and nodes (N) are shown in the figure. The same networks with microbial labels of nodes are shown in Fig. S4B. * p.adj < 0.05, ** p.adj < 0.01, *** p.adj < 0.001, **** p.adj < 0.0001

Sputum microbiotas varied between patients infected by different pathogens

Possible pathogens were identified in 548 samples from 256 patients by the FTD® Respiratory Pathogens 33 assay (Fig. 5A). Notably, there was good consistency between the results of 16S rRNA gene sequencing and the FTD assay (Fig. S5A). To avoid secondary infection, only 216 patients with a positive FTD result within the first three days after admission were used in subsequent analyses (11 patients positive for Pneumocystis jirovecii were excluded due to the small sample size). Ninety patients were suspected to be infected by at least one bacterial pathogen, 88 patients were suspected to be infected by viruses, and 38 patients were coinfected by both bacterial and viral pathogens (mix). We observed a significant difference in the microbiota composition between bacterial and viral infections, as well as between viral and mixed infections, and microbiotas under the three conditions were all different from that in HCs (PERMANOVA, p < 0.05, Fig. S5B), with the bacterial infection samples showing greater deviations (Fig. 5B). Different bacteria were enriched in the three distinct types of infections, whereas some commensal bacteria, such as Fusobacterium, were significantly depleted in all three types (Fig. S5C).

We then classified the infections into subgroups based on the pathogen detected, considering only those infecting more than fifteen patients (Rhinovirus, Mycoplasma pneumoniae, Klebsiella pneumoniae, and Influenza A) after excluding coinfection samples. Out of 18 patients detected with Mycoplasma pneumoniae, only three exhibited a predominance of Mycoplasma in their sputum microbiota (CS7, median Mycoplasma abundance = 42.4%), while the remaining samples were dominated by respiratory commensals (14 from CS2-4, one dominated by Lautropia, median Mycoplasma abundance = 2.6%). Similarly, only one of the 18 Klebsiella pneumoniae-positive patients was assigned to Enterobacteriaceae-dominant CS5, indicating that the detected pathogen was not necessarily the predominant bacterium. The microbiota composition (excluding the pathogen itself) in all four infections differed from that in HCs (PERMANOVA, p < 0.05, Fig. S5D). Although no significant difference in alpha diversity was observed between patients infected with different pathogens (Fig. S5E), the microbiota alterations relative to the HCs in Mycoplasma pneumoniae infections were less pronounced than in other infections (Fig. 5C). Specifically, we noted that rhinovirus infections were enriched with Enterococcus and Stenotrophomonas, influenza A infections with Acinetobacter and Pseudomonas, Mycoplasma pneumoniae infections with Rothia and Carnobacteriaceae, and Klebsiella pneumoniae infections with Acinetobacter (Fig. 5D). The microbiota composition differed between Mycoplasma pneumoniae infections and Klebsiella pneumoniae infections, as well as between Mycoplasma pneumoniae infections and rhinovirus infections (PERMANOVA, R2 = 0.081 and 0.078, p < 0.05, Fig. S5D).

Fig. 5

Microbiota features in patients infected with different pathogens. (A) Number of patients infected by different pathogens. (B) JSD distance to HCs for patients infected by bacteria, viruses, and mixed infection. (C) JSD distance to HCs for samples infected by Rhinovirus, Influenza A, Mycoplasma pneumoniae, and Klebsiella pneumoniae. (D) Bacteria associated with different types of infections identified by LEfSe (|LDA| score > 4, p < 0.05). The LDA score denotes the extent of enrichment of the bacterium in the infection type that is labeled in red on the x-axis. * p < 0.05, ** p < 0.01, *** p < 0.001, **** p < 0.0001


Recent studies proposed that respiratory microbiota dysbiosis, especially low community diversity, was implicated in pneumonia development [9, 19]. However, due to the small sample size and the underrepresentation of immunocompetent patients in previous studies, characteristics of the lower respiratory microbiota in CAP patients remain largely unknown. In this study, we revealed key features of sputum microbiota in 350 CAP patients through an examination of 917 longitudinal sputum samples.

First, the sputum microbiota in CAP patients is highly diverse. In contrast to previous studies that identified a limited number of microbiota community types in healthy populations and patients with pneumonia or other pulmonary diseases [7, 29, 30], we identified a more heterogeneous microbiota composition in CAP patients, with nine distinct microbiota clusters. This may be attributed to a larger sample size, better sample representation, and diverse pathogen types. The commensal bacteria that are typically found in the respiratory tract of healthy populations make up the majority of the sputum microbiota in most patients, suggesting potential resistance or resilience of the respiratory microbiota against acute infection. Meanwhile, a sizeable proportion of samples (14.0%) had microbiota with unusually high abundances of possible pathogens, including Enterobacteriaceae, Pseudomonas, Acinetobacter, Mycoplasma, and Stenotrophomonas, all previously proposed as pneumonia-causing pathogens [19, 31,32,33,34], suggesting abnormal pathogen growth. In addition, 10.3% of samples had a microbiota dominated by bacteria that are not typical pathogens, such as Corynebacterium, Rothia, and Haemophilus, highlighting the complexity of the CAP microbiota (Fig. S2I). A special group of patients with a relatively sterile microbial community was also identified, a phenomenon previously observed in the bronchoalveolar lavage fluid of healthy individuals and COPD patients [5, 29]. However, the presence of such a low microbial load in CAP patients is unexpected, given that CAP is typically associated with the proliferation of invasive or colonized bacteria, triggering an inflammatory response [5, 35]. The severity rate of those patients was similar to that of commensal-dominated patients (CS2-4), and lower than that of the patients dominated by possible pathogens (CS1,5,7,8,9). We hypothesize that a stronger immune response or lack of sufficient resources might have suppressed the growth of both commensal and pathogenic microbes in these patients.

Second, the degree of sputum microbiota dysbiosis correlated with disease severity in CAP patients. In line with previous findings, severe CAP patients exhibited lower alpha diversity compared to healthy controls upon admission [6, 8, 9]. However, we noticed that the alpha diversity of the sputum microbiota in non-severe cases deviated less from that of healthy controls, although their microbiota was still more likely to be dominated by a specific bacterium. Meanwhile, their microbiota compositions were more similar to those of healthy controls compared to severe cases. The most significantly enriched taxon in the sputum of severe cases was Enterobacteriaceae, a family commonly found in the gastrointestinal tract [36]. This increase may be due to the growth of colonizing bacteria or the translocation of gut bacteria to the respiratory tract, triggering systemic inflammation [37]. Furthermore, we observed a high transition rate from an Enterobacteriaceae-dominant microbiota to an Acinetobacter-dominant microbiota post-mechanical ventilation, suggesting increased vulnerability to ventilation-induced secondary infection in Enterobacteriaceae-dominant cases. Thus, a high level of Enterobacteriaceae in the sputum seems to predict a poor prognosis in CAP patients.

Third, the sputum microbiota in severe cases was more vulnerable and susceptible to significant changes during hospitalization, evidenced by higher compositional alteration, more frequent cluster switching, and more significant changes in the microbial network. This pattern resembles observations in other respiratory diseases like COPD and COVID-19 [16, 38], potentially influenced by both medical intervention and disease progression [12, 19]. However, distinguishing the specific impact of each factor is challenging. Moreover, the duration of altered microbiota and its relationship with the persistence of respiratory symptoms remain unknown, warranting a longer follow-up study for clarification.

Fourth, the alteration of sputum microbiota was associated with the infecting pathogen. Rhinovirus infections exhibited enrichment of Enterococcus and Stenotrophomonas, aligning with previous studies reporting coinfection of Rhinovirus with Stenotrophomonas maltophilia or Enterococcus faecium [39, 40]. Meanwhile, influenza A infections showed enrichment of Acinetobacter and Pseudomonas, indicating a possible increased susceptibility to Acinetobacter baumannii and Pseudomonas aeruginosa after influenza A infection [41,42,43]. The underlying mechanism may involve viral infections damaging respiratory airways and concurrently impairing both innate and acquired immune responses. This creates a favorable environment for bacterial growth, adherence, and invasion into healthy sites of the respiratory tract [44]. In addition, Klebsiella pneumoniae infections, which were associated with a higher incidence of severe illness, showed more deviation from HCs (more dysbiotic) compared to Mycoplasma pneumoniae infections, which had a lower risk of severe illness. However, it is unclear to what extent the accompanying microbiota change, in addition to the pathogen’s direct influence, affects disease progression, as most cases with Klebsiella pneumoniae-positivity or Mycoplasma pneumoniae-positivity still possessed sputum microbiotas dominated by respiratory commensals. Such analysis is constrained by a small sample size and a diverse background microbiota, which could be overcome by conducting intervention experiments in animal models.

Our study has several limitations. First, pneumonia is a lung infection caused by various pathogens, hence the samples from the lungs (e.g., biopsy, bronchoalveolar lavage fluid) are particularly valuable. However, obtaining such samples involves invasive procedures, and longitudinal sampling is challenging. While sputum is commonly used as a proxy for lung samples [45], it inevitably contains upper respiratory tract microbes. The accuracy of sputum microbiota in reflecting lung microbiota is still debated [46, 47]. Second, the healthy microbiota data were obtained from three previous studies on the Chinese population, potentially differing from the population investigated in this study. We compared the CAP microbiota to different healthy datasets and reported only consistent results, making our conclusions more robust. Third, the use of antibiotics may influence sputum microbiota during hospitalization, but controlling this confounding factor is challenging as patients were not treated following the same protocol. Therefore, our analyses primarily focused on the samples taken upon admission when limited medical intervention had been applied. Fourth, the utilization of 16S rRNA gene sequencing restricted our study to primarily assessing the relationship between the abundance of genus-level microorganisms and the disease, while the functional attributes of the sputum microbiota were merely predicted by the bioinformatic method. Further investigations employing metagenomic and metatranscriptomic technologies are warranted to elucidate the more precise role exerted by airway microorganisms in respiratory infectious diseases.


In summary, our study demonstrated diverse sputum microbiota compositions in CAP patients, with many, especially in non-severe patients, resembling those in healthy individuals. Severe CAP cases were more likely to have microbiota dominated by potentially pathogenic bacteria and to undergo greater changes during hospitalization. Further studies, especially prospective and intervention studies, are needed to decipher the causality between respiratory microbiota change and disease severity.


Patients and sample collection

Spontaneous sputum samples were collected on days 1, 3, 5, 7, and 9 after admission from 367 CAP inpatients at six hospitals (Tongji Hospital, The Second Affiliated Hospital of Harbin Medical University, The First Affiliated Hospital of Xi’an Jiaotong University, The Third People’s Hospital of Shenzhen, ZhongDa Hospital, Fujian Provincial Hospital) located in different cities representing distinct geographical locations in mainland China between 2014 and 2017 (Fig. 1A). Sputum quality was assessed by the number of polymorphonuclear neutrophils (PMNs) and squamous epithelial cells (SECs) per low-power microscopic field (LPF; ×10 objective). Only qualified samples (> 25 PMNs and < 10 SECs per LPF) were included in the study [48]. The sputum samples were immediately placed into viral transport medium and stored at -80 °C until transported to the laboratory for processing (normally within a year).
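The sample-quality rule above can be written as a one-line check. This is a minimal sketch, not the authors' pipeline: the function name and arguments are hypothetical, and the second threshold is read as squamous epithelial cells (SECs) per LPF.

```python
def is_qualified_sputum(pmns_per_lpf: float, secs_per_lpf: float) -> bool:
    """Return True when a sample meets the stated thresholds:
    more than 25 PMNs and fewer than 10 SECs per low-power field.
    (Hypothetical helper; both thresholds are strict inequalities.)"""
    return pmns_per_lpf > 25 and secs_per_lpf < 10


# Example: a neutrophil-rich sample with little epithelial contamination passes.
print(is_qualified_sputum(pmns_per_lpf=40, secs_per_lpf=3))
```

Because both comparisons are strict, a sample with exactly 25 PMNs or exactly 10 SECs per LPF would be rejected under this reading.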

Patients in this study were diagnosed with CAP according to the guidelines for the diagnosis and treatment of community-acquired pneumonia [49]. Inclusion criteria were clinical manifestations of acute infection, respiratory symptoms, inflammatory changes revealed by chest X-ray or computed tomography, and no history of healthcare system exposure. The study primarily included patients who developed symptoms within 7 days; patients who had been ill for more than 7 days and experienced a sudden worsening of symptoms during treatment, suggestive of a possible secondary infection, were also included. Cases of pneumonia caused by non-infectious factors were excluded. Disease severity was determined following the Guideline of the American Thoracic Society and Infectious Diseases Society of America [50]. Specifically, CAP patients had to meet one primary criterion or three secondary criteria to be classified as clinically severe CAP cases. Primary criteria were (1) requirement for invasive mechanical ventilation and (2) septic shock necessitating vasopressor therapy. Secondary criteria were (1) respiratory rate ≥ 30 breaths/minute; (2) PaO2/FiO2 ratio ≤ 250; (3) multilobar infiltrates; (4) altered mental status or disorientation; (5) renal dysfunction (blood urea nitrogen level ≥ 20 mg/dL); (6) leukopenia (white blood cell count < 4 × 10^9/L); (7) thrombocytopenia (platelet count < 100 × 10^9/L); (8) hypothermia (core body temperature < 36.0 °C); and (9) hypotension requiring aggressive fluid resuscitation. Patients diagnosed with severe pneumonia at any time during hospitalization were recorded as severe cases.
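The severity rule described above (one primary criterion, or three secondary criteria, met at any time during hospitalization) can be sketched as follows. This is an illustration only: the class name, field names, and default "normal" values are hypothetical, and "three secondary criteria" is read as "at least three".

```python
from dataclasses import dataclass


@dataclass
class Findings:
    """One clinical assessment. Field names and defaults are hypothetical."""
    # Primary criteria
    invasive_ventilation: bool = False
    septic_shock_on_vasopressors: bool = False
    # Secondary criteria (raw measurements or flags)
    respiratory_rate: float = 16.0          # breaths/minute
    pao2_fio2: float = 400.0                # PaO2/FiO2 ratio
    multilobar_infiltrates: bool = False
    altered_mental_status: bool = False
    bun_mg_dl: float = 10.0                 # blood urea nitrogen, mg/dL
    wbc_1e9_per_l: float = 7.0              # white blood cells, 10^9/L
    platelets_1e9_per_l: float = 250.0      # platelets, 10^9/L
    core_temp_c: float = 37.0               # core body temperature, deg C
    hypotension_needing_fluids: bool = False


def count_secondary(f: Findings) -> int:
    """Count how many of the nine secondary criteria are met."""
    checks = [
        f.respiratory_rate >= 30,
        f.pao2_fio2 <= 250,
        f.multilobar_infiltrates,
        f.altered_mental_status,
        f.bun_mg_dl >= 20,
        f.wbc_1e9_per_l < 4,
        f.platelets_1e9_per_l < 100,
        f.core_temp_c < 36.0,
        f.hypotension_needing_fluids,
    ]
    return sum(checks)


def is_severe(f: Findings) -> bool:
    """Severe if any primary criterion or at least three secondary criteria hold."""
    primary = f.invasive_ventilation or f.septic_shock_on_vasopressors
    return primary or count_secondary(f) >= 3


def severe_during_stay(assessments: list) -> bool:
    """A patient is recorded as severe if any assessment during the stay is severe."""
    return any(is_severe(f) for f in assessments)
```

For example, a patient with a respiratory rate of 32, a PaO2/FiO2 ratio of 240, and a blood urea nitrogen level of 25 mg/dL meets three secondary criteria and would be classified as severe under this sketch.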

Statistical analysis

Alpha diversity was calculated with the estimate_richness function in the R package phyloseq (v.1.38.0) [51]. Beta diversity, represented by the Jensen-Shannon divergence (JSD) distance, was calculated with the phyloseq R package (v4.0.3) [51]. Permutational multivariate analysis of variance (PERMANOVA) was used to compare the microbiota composition between groups [52]; p-values were calculated based on 999 permutations. All possible confounders (variables 1–10 in Table 1) were included in the multivariate PERMANOVA. The Wilcoxon signed-rank test was used to compare continuous variables between groups. Fisher’s exact test was used to test the association between categorical variables. P-values were adjusted for multiple testing using the Benjamini-Hochberg method.
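To make two of the quantities above concrete, the following sketch computes the Jensen-Shannon divergence between two samples and the Benjamini-Hochberg adjustment of a set of p-values. It is a plain-Python illustration, not the phyloseq or R code used in the study: the function names are hypothetical, natural logarithms are assumed, and the divergence itself (not its square root) is returned, so values may differ from a given package's output by these conventions.

```python
import math


def _kl(p, q):
    """Kullback-Leibler divergence KL(p || q); terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(counts_a, counts_b):
    """Jensen-Shannon divergence between two samples given per-taxon counts
    listed in the same taxon order (natural log assumed)."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    pa = [c / total_a for c in counts_a]
    pb = [c / total_b for c in counts_b]
    m = [(x + y) / 2 for x, y in zip(pa, pb)]  # mixture distribution
    return 0.5 * _kl(pa, m) + 0.5 * _kl(pb, m)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Two toy samples over three genera (hypothetical counts).
print(jsd([80, 15, 5], [10, 30, 60]))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

With natural logarithms the divergence ranges from 0 for identical compositions to ln 2 for samples that share no taxa. PERMANOVA, which permutes group labels over such a distance matrix, is not reproduced here.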

Additional methods applied in the study are described in the supplementary methods.

Data availability

Raw sequencing data have been deposited in the GSA in the National Genomics Data Center (HRA002709). All statistical analyses were implemented in RStudio and the scripts and data could be accessed at



Abbreviations

COPD: chronic obstructive pulmonary disease
BMI: body-mass index
PERMANOVA: permutational multivariate analysis of variance
CAP: community-acquired pneumonia
NC: negative controls
HC: healthy controls
JSD: Jensen–Shannon divergence
PCoA: principal coordinate analysis
LEfSe: linear discriminant analysis effect size
LDA: linear discriminant analysis
ROC: receiver operator characteristic
SPIEC-EASI: sparse inverse covariance estimation for ecological association inference


  1. Shoar S, Musher DM. Etiology of community-acquired pneumonia in adults: a systematic review. Pneumonia. 2020;12(1):11.

  2. Age-sex differences in the global burden of lower respiratory infections and risk factors, 1990–2019: results from the Global Burden of Disease Study 2019. Lancet Infect Dis. 2022;22(11):1626–47.

  3. Global burden of 369 diseases and injuries in 204 countries and territories, 1990–2019: a systematic analysis for the Global Burden of Disease Study 2019. Lancet (London, England). 2020;396(10258):1204-22.

  4. Wu BG, Segal LN. The Lung Microbiome and its role in Pneumonia. Clin Chest Med. 2018;39(4):677–89.

  5. Segal LN, Clemente JC, Tsay J-CJ, Koralov SB, Keller BC, Wu BG, et al. Enrichment of the lung microbiome with oral taxa is associated with lung inflammation of a Th17 phenotype. Nat Microbiol. 2016;1(5):16031.

  6. Man WH, van Houten MA, Mérelle ME, Vlieger AM, Chu MLJN, Jansen NJG, et al. Bacterial and viral respiratory tract microbiota and host characteristics in children with lower respiratory tract infections: a matched case-control study. Lancet Respiratory Med. 2019;7(5):417–26.

  7. Shenoy MK, Iwai S, Lin DL, Worodria W, Ayakaka I, Byanyima P, et al. Immune Response and Mortality Risk relate to distinct lung microbiomes in patients with HIV and Pneumonia. Am J Respir Crit Care Med. 2017;195(1):104–14.

  8. Shankar J, Nguyen M-H, Crespo M, Kwak E, Lucas S, McHugh K et al. Looking beyond respiratory cultures: Microbiome-Cytokine signatures of bacterial pneumonia and Tracheobronchitis in Lung Transplant recipients. Am J Transplantation. 2016;16(6):1766-78.

  9. de Steenhuijsen Piters WAA, Huijskens EGW, Wyllie AL, Biesbroek G, van den Bergh MR, Veenhoven RH, et al. Dysbiosis of upper respiratory tract microbiota in elderly pneumonia patients. ISME J. 2016;10(1):97–108.

  10. Thibeault C, Suttorp N, Opitz B. The microbiota in pneumonia: from protection to predisposition. Sci Transl Med. 2021;13(576).

  11. Dickson RP, Erb-Downward JR, Huffnagle GB. Towards an ecology of the lung: new conceptual models of pulmonary microbiology and pneumonia pathogenesis. Lancet Respir Med. 2014;2(3):238–46.

  12. Man WH, de Steenhuijsen Piters WAA, Bogaert D. The microbiota of the respiratory tract: gatekeeper to respiratory health. Nat Rev Microbiol. 2017;15(5):259–70.

  13. Adar SD, Huffnagle GB, Curtis JL. The respiratory microbiome: an underappreciated player in the human response to inhaled pollutants? Ann Epidemiol. 2016;26(5):355–9.

  14. Carney SM, Clemente JC, Cox MJ, Dickson RP, Huang YJ, Kitsios GD, et al. Methods in lung Microbiome Research. Am J Respir Cell Mol Biol. 2020;62(3):283–99.

  15. Das S, Bernasconi E, Koutsokera A, Wurlod D-A, Tripathi V, Bonilla-Rosso G, et al. A prevalent and culturable microbiota links ecological balance to clinical stability of the human lung after transplantation. Nat Commun. 2021;12(1):2126.

  16. Mayhew D, Devos N, Lambert C, Brown JR, Clarke SC, Kim VL, et al. Longitudinal profiling of the lung microbiome in the AERIS study demonstrates repeatability of bacterial and eosinophilic COPD exacerbations. Thorax. 2018;73(5):422–30.

  17. Frayman KB, Armstrong DS, Carzino R, Ferkol TW, Grimwood K, Storch GA, et al. The lower airway microbiota in early cystic fibrosis lung disease: a longitudinal analysis. Thorax. 2017;72(12):1104–12.

  18. Sulaiman I, Chung M, Angel L, Tsay JJ, Wu BG, Yeung ST, et al. Microbial signatures in the lower airways of mechanically ventilated COVID-19 patients associated with poor clinical outcome. Nat Microbiol. 2021;6(10):1245–58.

  19. Zakharkina T, Martin-Loeches I, Matamoros S, Povoa P, Torres A, Kastelijn JB, et al. The dynamics of the pulmonary microbiome during mechanical ventilation in the intensive care unit and the association with occurrence of pneumonia. Thorax. 2017;72(9):803–10.

  20. Du S, Shang L, Zou X, Deng X, Sun A, Mu S, et al. Azithromycin exposure induces transient Microbial composition shifts and decreases the Airway Microbiota Resilience from Outdoor PM(2.5) stress in healthy adults: a Randomized, Double-Blind, placebo-controlled trial. Microbiol Spectr. 2023;11(3):e0206622.

  21. Cai X, Luo Y, Zhang Y, Lin Y, Wu B, Cao Z, et al. Airway microecology in rifampicin-resistant and rifampicin-sensitive pulmonary tuberculosis patients. BMC Microbiol. 2022;22(1):286.

  22. Lin L, Yi X, Liu H, Meng R, Li S, Liu X, et al. The airway microbiome mediates the interaction between environmental exposure and respiratory health in humans. Nat Med. 2023;29(7):1750–9.

  23. Douglas GM, Maffei VJ, Zaneveld JR, Yurgel SN, Brown JR, Taylor CM, et al. PICRUSt2 for prediction of metagenome functions. Nat Biotechnol. 2020;38(6):685–8.

  24. Lippi G, Franchini M. Vitamin K in neonates: facts and myths. Blood Transfus = Trasfusione del sangue. 2011;9(1):4–9.

  25. Arslan S, Ugurlu S, Bulut G, Akkurt I. The association between plasma D-dimer levels and community-acquired pneumonia. Clin (Sao Paulo). 2010;65(6):593–7.

  26. Agapakis DI, Tsantilas D, Psarris P, Massa EV, Kotsaftis P, Tziomalos K, et al. Coagulation and inflammation biomarkers may help predict the severity of community-acquired pneumonia. Respirology. 2010;15(5):796–803.

  27. Vinolo MA, Rodrigues HG, Nachbar RT, Curi R. Regulation of inflammation by short chain fatty acids. Nutrients. 2011;3(10):858–76.

  28. Furusawa Y, Obata Y, Fukuda S, Endo TA, Nakato G, Takahashi D, et al. Commensal microbe-derived butyrate induces the differentiation of colonic regulatory T cells. Nature. 2013;504(7480):446–50.

  29. Ren L, Zhang R, Rao J, Xiao Y, Zhang Z, Yang B et al. Transcriptionally active lung Microbiome and its Association with bacterial biomass and host inflammatory status. mSystems. 2018;3(5).

  30. Mac Aogáin M, Narayana JK, Tiew PY, Ali N, Yong VFL, Jaggi TK, et al. Integrative microbiomics in bronchiectasis exacerbations. Nat Med. 2021;27(4):688–99.

  31. Dickson RP, Schultz MJ, van der Poll T, Schouten LR, Falkowski NR, Luth JE, et al. Lung microbiota predict clinical outcomes in critically ill patients. Am J Respir Crit Care Med. 2020;201(5):555–63.

  32. Garnacho-Montero J, Timsit JF. Managing Acinetobacter baumannii infections. Curr Opin Infect Dis. 2019;32(1):69–76.

  33. Kishaba T. Community-Acquired Pneumonia caused by Mycoplasma pneumoniae: how physical and radiological examination contribute to successful diagnosis. Front Med (Lausanne). 2016;3:28.

  34. Kanderi T, Shrimanker I, Mansoora Q, Shah K, Yumen A, Komanduri S. Stenotrophomonas maltophilia: an Emerging Pathogen of the respiratory tract. Am J Case Rep. 2020;21:e921466.

  35. Pahal PRV, Rajasurya V, Sharma S. Typical Bacterial Pneumonia. Treasure Island (FL): StatPearls Publishing; 2024.

  36. Martinson JNV, Pinkham NV, Peters GW, Cho H, Heng J, Rauch M, et al. Rethinking gut microbiome residency and the Enterobacteriaceae in healthy human adults. ISME J. 2019;13(9):2306–18.

  37. Dickson RP, Singer BH, Newstead MW, Falkowski NR, Erb-Downward JR, Standiford TJ, et al. Enrichment of the lung microbiome with gut bacteria in sepsis and the acute respiratory distress syndrome. Nat Microbiol. 2016;1(10):16113.

  38. Ren L, Wang Y, Zhong J, Li X, Xiao Y, Li J, et al. Dynamics of the Upper Respiratory Tract Microbiota and its Association with Mortality in COVID-19. Am J Respir Crit Care Med. 2021;204(12):1379–90.

  39. Hung HM, Yang SL, Chen CJ, Chiu CH, Kuo CY, Huang KA, et al. Molecular epidemiology and clinical features of rhinovirus infections among hospitalized patients in a medical center in Taiwan. J Microbiol Immunol Infect. 2019;52(2):233–41.

  40. Jacobs SE, Soave R, Shore TB, Satlin MJ, Schuetz AN, Magro C, et al. Human rhinovirus infections of the lower respiratory tract in hematopoietic stem cell transplant recipients. Transpl Infect Disease: Official J Transplantation Soc. 2013;15(5):474–86.

  41. Zhou Y, Du J, Wu JQ, Zhu QR, Xie MZ, Chen LY, et al. Impact of influenza virus infection on lung microbiome in adults with severe pneumonia. Ann Clin Microbiol Antimicrob. 2023;22(1):43.

  42. Jie F, Wu X, Zhang F, Li J, Liu Z, He Y, et al. Influenza virus infection increases host susceptibility to secondary infection with Pseudomonas aeruginosa, and this is attributed to neutrophil dysfunction through reduced myeloperoxidase activity. Microbiol Spectr. 2023;11(1):e0365522.

  43. Liu WJ, Zou R, Hu Y, Zhao M, Quan C, Tan S, et al. Clinical, immunological and bacteriological characteristics of H7N9 patients nosocomially co-infected by Acinetobacter baumannii: a case control study. BMC Infect Dis. 2018;18(1):664.

  44. Manna S, Baindara P, Mandal SM. Molecular pathogenesis of secondary bacterial infection associated to viral infections including SARS-CoV-2. J Infect Public Health. 2020;13(10):1397–404.

  45. Rogers GB, van der Gast CJ, Cuthbertson L, Thomson SK, Bruce KD, Martin ML, et al. Clinical measures of disease in adult non-CF bronchiectasis correlate with airway microbiota composition. Thorax. 2013;68(8):731.

  46. Feng Z-H, Li Q, Liu S-R, Du X-N, Wang C, Nie X-H et al. Comparison of composition and diversity of bacterial Microbiome in Human Upper and Lower Respiratory Tract. Chin Med J. 2017;130(9).

  47. An SQ, Warris A, Turner S. Microbiome characteristics of induced sputum compared to bronchial fluid and upper airway samples. Pediatr Pulmonol. 2018;53(7):921–8.

  48. Gal-Oz A, Kassis I, Shprecher H, Beck R, Bentur L. Correlation between Rapid Strip Test and the quality of Sputum. Chest. 2004;126(5):1667–71.

  49. He LX. Guidelines for the diagnosis and treatment of community-acquired pneumonia: learning and practicing. Chin J Tuberculosis Respiratory Dis. 2006;29(10):649–50.

  50. Mandell LA, Wunderink RG, Anzueto A, Bartlett JG, Campbell GD, Dean NC, et al. Infectious Diseases Society of America/American Thoracic Society consensus guidelines on the management of community-acquired pneumonia in adults. Clin Infect Dis. 2007;44(Suppl 2):S27–72.

  51. McMurdie PJ, Holmes S. Phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS ONE. 2013;8(4):e61217.

  52. Oksanen J, Blanchet FG, Kindt R, Legendre P, Minchin P, O’Hara RB et al. Vegan: Community Ecology Package. R Package Version. 2.0–10. CRAN. 2013.


We thank all the participants who donated their specimens to this study.


This study was supported by funding from the National Key R&D Program of China (grant no. 2022YFA1304300 to L.R. and M.L.), Non-profit Central Research Institute Fund of Chinese Academy of Medical Sciences (grant no. 2019PT310029 to L.R.), the Chinese Academy of Medical Sciences (CAMS) Innovation Fund for Medical Sciences (CIFMS) (grant no. 2021-I2M-1-038 to J.W.), the Fundamental Research Funds for the Central Universities (grant no. 3332021092 to J.W.), the National Natural Science Foundation of China (grant no. 32100098 to L.Z.), the Beijing Municipal Natural Science Foundation of China (grant no. Z190017 to L.R.), the Beijing Nova Program (grant no. Z191100006619102 to J.W., Z211100002121135 to L.Z.), and Fondation Merieux (grant no. N/A to J.W.). The funders had no role in the design of this study and did not have any role during its execution, analyses, interpretation of the data, or decision to submit results.

Author information


L.R., M.L., and J.W. designed the study. J.Y., J.L. and L.F.Z. conducted the experiments, performed the statistical analysis, and wrote the initial draft of the manuscript. Y.X., Y.W., L.C., and X.W. contributed to the processing of the specimens. G.Z., M.C., F.C., and L.L. participated in the recruitment of subjects and contributed to clinical data acquisition. Z.S., L.Z., Z.W. and L.W. contributed to computational analysis. M.L., L.R., and J.W. revised the manuscript. J.Y., J.L., L.F.Z., L.R., M.L., and J.W. have assessed and verified the underlying data reported in the manuscript. All authors contributed to this article and approved the submitted version.

Corresponding authors

Correspondence to Jianwei Wang, Mingkun Li or Lili Ren.

Ethics declarations

Ethics approval and consent to participate

The study was approved by the Institutional Review Board of Ethics of the hospitals where sampling was conducted and of the Institute of Pathogen Biology, Chinese Academy of Medical Sciences (record number: IPB-2018-3), and written informed consent was obtained from each subject before inclusion. The private information of all patients was kept confidential.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Supplementary Material 3

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Yang, J., Li, J., Zhang, L. et al. Highly diverse sputum microbiota correlates with the disease severity in patients with community-acquired pneumonia: a longitudinal cohort study. Respir Res 25, 223 (2024).
