Machine learning for accurate detection of small airway dysfunction-related respiratory changes: an observational study

Abstract

Background

The use of machine learning (ML) methods could improve the diagnosis of small airway dysfunction (SAD) in subjects with chronic respiratory symptoms and preserved pulmonary function (PPF). This paper evaluated the performance of several ML algorithms combined with impulse oscillometry (IOS) analysis to aid in the diagnosis of respiratory changes in SAD. We also identified the best configuration for this task.

Methods

IOS and spirometry were measured in 280 subjects, including a healthy control group (n = 78), a group with normal spirometry (n = 158) and a group with abnormal spirometry (n = 44). Several supervised ML algorithms were examined, namely Support Vector Machines (SVM), Random Forests (RF), Adaptive Boosting (ADABOOST), Naive Bayes (BAYES), and K-Nearest Neighbors (KNN), together with several feature selection strategies.

Results

In the first experiment, the best oscillometric parameter (BOP) was R5, with an AUC of 0.642 when comparing the healthy control group (CG) with PPF patients without spirometry-defined SAD (PPFN), and an AUC of 0.769 when comparing CG with PPF patients with spirometry-defined SAD (PPFA). In the second experiment, ML techniques were applied. In CGvsPPFN, RF and ADABOOST had the best diagnostic results (AUC = 0.914 and 0.915, respectively), with significantly higher accuracy than BOP (p < 0.01). In CGvsPPFA, RF and ADABOOST again had the best diagnostic results (AUC = 0.951 and 0.971, respectively), also significantly higher than BOP (p < 0.01). In the third, fourth and fifth experiments, different feature selection techniques identified the best IOS parameters (R5, (R5-R20)/R5 and Fres). The performance of ADABOOST remained essentially unchanged after feature selection, whereas the diagnostic accuracy of the other four classifiers (RF, SVM, BAYES, and KNN) improved marginally.

Conclusions

IOS combined with ML algorithms provides a new method for diagnosing SAD in subjects with chronic respiratory symptoms and PPF. The present study’s findings provide evidence that this combination may help in the early diagnosis of respiratory changes in these patients.

Background

Asthma and chronic obstructive pulmonary disease (COPD) are two major chronic respiratory disorders that can affect the small airways. Evidence from prospective studies indicates that small airway dysfunction (SAD) may precede asthma and COPD [1,2,3]. Symptoms of COPD and asthma include cough, phlegm production, dyspnea, and wheezing. Some symptomatic subjects do not meet the pulmonary function criteria for COPD or asthma: they have negative airway hyperresponsiveness (AHR) or bronchial reversibility (BR) tests and preserved pulmonary function (PPF, forced expiratory volume in 1 s (FEV1)/forced vital capacity (FVC) ratio ≥ 0.70 [4]), yet may still have SAD. According to a large-scale multi-stage stratified sampling survey, about 40% of Chinese individuals 20 years of age and older have spirometry-defined SAD [5]. Given the severe impact of SAD, it is crucial to identify and treat the condition early.

The small airways (inner diameter of less than 2 mm) make up the “quiet zone”: they have a huge total cross-sectional area and contribute very little to total airway resistance [6]. In clinical practice, spirometry is the most widely used technique to assess small-airway function. The parameters employed include the forced expiratory flow at 50% of FVC (FEF50%), the forced expiratory flow at 75% of FVC (FEF75%), and the forced expiratory flow between 25% and 75% of FVC (FEF25–75%). Spirometry-defined SAD is diagnosed when at least two of these three small airway markers are less than 65% of the predicted value [5]. However, spirometry requires good cooperation from subjects, and the great variability of its values means that its reliability is not universally accepted [7, 8]. Impulse oscillometry (IOS) is an approach to measuring respiratory impedance based on the forced oscillation technique (FOT). The IOS measurement requires only quiet tidal breathing, which is easy to perform, suitable for a broad spectrum of individuals, and yields a variety of respiratory physiological data. Because the oscillatory signals are superimposed externally, IOS is independent of subject effort, unlike spirometry [9]. Furthermore, it appears to correlate better with small airway features and may be more sensitive in identifying SAD [10,11,12]. Since IOS reflects the viscous properties of the respiratory system through resistance (Rrs) and its elastic and inertial properties through reactance (Xrs), it can be combined with spirometry to gain more insight into individual pathological changes.

However, IOS is not yet frequently used in pulmonary function assessment. One drawback is that it relies on electrical engineering concepts, which can be challenging to interpret in a clinical context. Another important consideration is the cost of the equipment. Therefore, even though the IOS test itself is straightforward, a busy or inexperienced pulmonary function technician or primary care physician would find it challenging to interpret the resistance and reactance curves, as well as the derived values, without proper training and expertise. Furthermore, the dispersion of IOS values makes the analysis difficult. Consequently, machine learning (ML)-based computer-aided decision systems could enhance the usefulness of IOS and support physicians in the diagnosis, monitoring, and treatment of chronic respiratory disorders such as asthma and COPD.

In this context, we hypothesized that the use of ML methods in combination with the IOS test would improve the diagnosis of small airway dysfunction in the PPF population. This study aims to evaluate the performance of several ML algorithms in diagnosing SAD in the PPF population, and to find the best configuration.

Materials and methods

Study population

This was a single-center, observational study conducted in the Pulmonary Function Laboratory of West China Hospital, Sichuan University. Subjects were recruited and tested from May 1st to September 1st, 2020.

Adult patients undergoing pulmonary function tests because of persistent respiratory complaints were included. In addition, participants had to meet the PPF requirement (FEV1/FVC ≥ 0.70) [4]. Subjects were excluded if they had any of the following: restrictive pulmonary disease (FVC < 80% predicted), asthma, interstitial lung disease, lung cancer, respiratory infection within the previous two weeks, myocardial ischemia, a history of pulmonary surgery, or incomplete IOS due to tongue position errors, vocal cord closure, or swallowing. As healthy controls, we also enrolled never-smokers (those with ≤ 1 pack-year of tobacco smoking history) with a normal chest radiograph, no active pulmonary conditions, and no unstable cardiovascular disorders. Basic demographic data were gathered, including height, weight, age, sex, and body mass index (BMI). Subjects underwent IOS and spirometry and completed a questionnaire covering qualitative and quantitative evaluation of symptoms. Bronchial provocation tests or bronchodilator tests were also performed to exclude asthma. The study was approved by the ethics committee of West China Hospital, Sichuan University, and all participants signed informed consent before the procedure.

Impulse oscillometry and parameters

In accordance with ERS guidelines, respiratory resistance and reactance were measured using IOS equipment (MS-IOS Jaeger) [9]. Because forced expiration may alter airway tone, IOS was performed prior to spirometry [13]. Pressure oscillations ranging from 5 to 35 Hz in frequency were generated by a loudspeaker and superimposed onto normal tidal breathing through a mouthpiece for 30 to 45 s. Subjects sat upright, wore a nasal clip, and pressed their hands against their cheeks to minimize the influence of cheek vibration and air leak.

The IOS parameters selected in this paper and their clinical significance are as follows:

(1) Respiratory resistance at 5 Hz (R5): reflects the total viscous resistance of the respiratory system. Because this is mainly airway resistance, R5 is also known as total airway resistance.

(2) Respiratory resistance at 20 Hz (R20): reflects central airway resistance.

(3) The difference between R5 and R20 (R5–R20): reflects the frequency dependence of resistance, that is, the change in respiratory system resistance as the oscillation frequency is gradually increased. It is taken as a measure of peripheral airway resistance.

(4) (R5-R20)/R5 (%): the ratio of peripheral airway resistance to total airway resistance.

(5) Reactance at 5 Hz (X5): reflects the total elastic resistance of the respiratory system. Because it is dominated by the elastic resistance of the lung and thorax, it is often called peripheral elastic resistance; it also includes gas compression in the airway and alveoli. X5 is generally negative, with more negative values indicating greater elastic resistance.

(6) Reactance area (AX): the area enclosed by the Xrs-frequency curve between 5 Hz and Fres and the horizontal zero axis. AX is the integral of the low-frequency reactance.

(7) Resonant frequency (Fres): the inertial resistance and elastic resistance act in opposite directions. Fres is the frequency at which the two are equal and cancel each other, so that the reactance of the respiratory system is zero.
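The relationship between the measured curves and the derived parameters listed above can be illustrated with a minimal sketch. It assumes that resistance and reactance are available as samples at discrete frequencies; the function name, the sample values, the linear interpolation of the zero crossing, and the trapezoidal integration are all illustrative assumptions and not the algorithm of the IOS device used in the study.

```python
def derived_ios_parameters(freqs, rrs, xrs):
    """Derive R5-R20, (R5-R20)/R5, Fres and AX from sampled Rrs and Xrs curves.

    freqs: ascending list of frequencies in Hz (must contain 5 and 20)
    rrs, xrs: resistance and reactance sampled at those frequencies
    """
    r5 = rrs[freqs.index(5)]
    r20 = rrs[freqs.index(20)]
    x5 = xrs[freqs.index(5)]

    # Fres: first zero crossing of the reactance curve (linear interpolation).
    fres = None
    for i in range(len(freqs) - 1):
        if xrs[i] < 0 <= xrs[i + 1]:
            frac = -xrs[i] / (xrs[i + 1] - xrs[i])
            fres = freqs[i] + frac * (freqs[i + 1] - freqs[i])
            break
    if fres is None:
        raise ValueError("reactance does not cross zero in the sampled range")

    # AX: area between the reactance curve and the zero axis from 5 Hz to Fres.
    # Reactance is negative in that range, so the sign is flipped.
    pts = [(f, x) for f, x in zip(freqs, xrs) if 5 <= f < fres] + [(fres, 0.0)]
    ax = sum(-(x0 + x1) / 2 * (f1 - f0)
             for (f0, x0), (f1, x1) in zip(pts, pts[1:]))

    return {
        "R5": r5,
        "R20": r20,
        "X5": x5,
        "R5-R20": r5 - r20,
        "(R5-R20)/R5": (r5 - r20) / r5,
        "Fres": fres,
        "AX": ax,
    }


# Hypothetical curve values, chosen only to make the arithmetic easy to follow.
example = derived_ios_parameters(
    freqs=[5, 10, 15, 20, 25, 35],
    rrs=[0.40, 0.36, 0.33, 0.30, 0.30, 0.31],
    xrs=[-0.12, -0.06, -0.02, 0.02, 0.05, 0.09],
)
print(example)
```

In this made-up curve the reactance crosses zero halfway between 15 and 20 Hz, so Fres is 17.5 Hz, and the peripheral share of resistance, (R5-R20)/R5, is 25%.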

Spirometry and parameters

Spirometry was performed with a full MasterScreen PFT System (Jaeger Corp. Germany) according to the American Thoracic Society (ATS)/European Respiratory Society (ERS) guidelines [14]. FEV1, FVC, FEV1/FVC, FEF25–75%, FEF50% and FEF75% were recorded as percentages of predicted values. The prediction equations are based on a large study of normal spirometry values in Chinese individuals aged 4–80 years, which is recommended in the spirometry guideline in China [15].

Data sets

The data set used for the experiments included measurements from 280 participants. It contained the volunteers’ IOS and spirometry results in addition to demographic data such as age, sex, height, and weight. The PPF patients without SAD (PPFN group) contributed 158 records, the PPF patients with SAD (PPFA group) contributed 44 records, and the healthy control group (CG) contributed 78 records. Using random sampling, the data set was split into training and test sets in a 7:3 ratio. All reported results are from the test sets. Hyperparameters were tuned manually, taking the setting with the best average result.
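A 7:3 random split of this kind could be sketched with scikit-learn as follows. The feature matrix here is random noise standing in for the CGvsPPFA scenario (78 controls, 44 patients, 7 IOS parameters); stratifying on the label and the fixed random seed are assumptions added for the illustration, since the study only states that random sampling was used.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for CG vs PPFA: 78 controls (label 0), 44 patients (label 1),
# 7 IOS features per subject. Random numbers, not study data.
X = rng.normal(size=(78 + 44, 7))
y = np.array([0] * 78 + [1] * 44)

# 7:3 random split. stratify=y keeps the class ratio similar in both parts
# (an assumption; the paper does not say whether the split was stratified).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
print(X_train.shape, X_test.shape)
```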

The studied classifiers

The discrete data measured by IOS can be thoroughly analyzed by ML algorithms to identify potential relationships. These ML algorithms were assessed in this study based on the findings of earlier research and pre-experiments:

(1) Random forests: A supervised ensemble method that uses a “bagging” approach to build multiple decision trees, each on a random subset of the data. The predictions of these trees are then combined to obtain a more accurate and stable prediction [16].

(2) Support vector machine: A supervised ML algorithm that classifies data points by finding the optimal hyperplane that maximally separates different classes in a high-dimensional space [17].

(3) Naive Bayes: A probabilistic classifier based on Bayes’ theorem [18].

(4) Adaptive Boosting (ADABOOST): A statistical classification algorithm that is frequently used with other “weaker” ML algorithms (e.g., decision trees) to improve their performance [19].

(5) K-Nearest Neighbors (KNN): A supervised ML method that classifies a sample according to the labels of the k closest samples in the training set, typically by majority vote [20].
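As a rough sketch of how the five classifiers listed above might be trained and compared by test-set AUC with scikit-learn: the data below are synthetic (patients shifted by one standard deviation on every feature), and the hyperparameters, the feature scaling for SVM and KNN, and the helper `positive_scores` are assumptions for the illustration, not the configuration used in the study.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic data: 78 "controls" and 44 "patients" with 7 features each.
X = np.vstack([rng.normal(0.0, 1.0, (78, 7)), rng.normal(1.0, 1.0, (44, 7))])
y = np.array([0] * 78 + [1] * 44)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Illustrative hyperparameters only.
classifiers = {
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "BAYES": GaussianNB(),
    "ADABOOST": AdaBoostClassifier(n_estimators=100, random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}


def positive_scores(model, X):
    """Continuous score for the positive class, whichever method the model offers."""
    if hasattr(model, "predict_proba"):
        return model.predict_proba(X)[:, 1]
    return model.decision_function(X)


aucs = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    aucs[name] = roc_auc_score(y_test, positive_scores(clf, X_test))
    print(f"{name}: AUC = {aucs[name]:.3f}")
```

The AUC is computed from continuous scores rather than from hard class predictions, which is why the helper falls back to `decision_function` for the SVM.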

In addition, this study applied feature selection, using the SelectKBest, RFECV, and SelectFromModel algorithms, in order to find the IOS parameters most strongly related to the outcome and to reduce the complexity of the data set.

(6) SelectKBest: A feature selection method based on statistical tests, which selects the K features that are most relevant to the target variable according to a chosen scoring function [21].

(7) RFECV: A feature selection method in scikit-learn that combines Recursive Feature Elimination (RFE) and Cross-Validation (CV) to select the best feature subset [22].

(8) SelectFromModel: A feature selection method in scikit-learn, which selects the most relevant features based on the feature importance of a supervised learning model [23].
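A minimal SelectKBest sketch is shown below. The data are synthetic, with three columns deliberately made informative, and the ANOVA F-test (`f_classif`, the scikit-learn default) and `k=3` are assumptions, since the scoring function used in the study is not stated here.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

feature_names = ["R5", "R20", "R5-R20", "(R5-R20)/R5", "X5", "AX", "Fres"]

rng = np.random.default_rng(0)
X = rng.normal(size=(122, 7))
y = np.array([0] * 78 + [1] * 44)

# Make three columns informative in this synthetic example by shifting
# the "patient" rows; the other four columns remain pure noise.
for col in (0, 3, 6):
    X[y == 1, col] += 1.5

# Score every feature against the label and keep the k best.
selector = SelectKBest(score_func=f_classif, k=3).fit(X, y)
selected = [n for n, keep in zip(feature_names, selector.get_support()) if keep]
print(selected)

# The reduced matrix can then be passed to any of the classifiers.
X_selected = selector.transform(X)
```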

Experiment design

Five experiments were conducted in this study.

The goal of the first experiment was to assess the capacity of each IOS parameter to identify SAD in patients with PPF. SAD was diagnosed when at least two of the three spirometric small airway measurements (FEF25–75%, FEF50%, and FEF75%) were less than 65% of the predicted value. To assess the degree of airway obstruction in patients with PPF, we examined two distinct scenarios: control versus PPF patients without SAD (CGvsPPFN) and control versus PPF patients with SAD (CGvsPPFA). The same two scenarios were assessed in the remaining experiments.
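The diagnostic accuracy of a single parameter can be summarized by its AUC, which equals the probability that a randomly chosen patient has a higher value than a randomly chosen control (ties counted as one half). The sketch below computes it directly from that definition; the function name and the R5 values are hypothetical and serve only to illustrate the calculation.

```python
def auc_single_parameter(control_values, patient_values):
    """AUC of one parameter used directly as a score (higher = more abnormal)."""
    wins = 0.0
    for p in patient_values:
        for c in control_values:
            if p > c:
                wins += 1.0      # patient correctly ranked above control
            elif p == c:
                wins += 0.5      # ties count as half
    return wins / (len(patient_values) * len(control_values))


# Hypothetical R5 values, not study data.
r5_controls = [0.28, 0.30, 0.31, 0.33, 0.35]
r5_patients = [0.32, 0.34, 0.38, 0.41]

r5_auc = auc_single_parameter(r5_controls, r5_patients)
print(f"AUC for R5 = {r5_auc:.2f}")  # 17 of the 20 patient-control pairs are ranked correctly
```

Repeating this for each of the seven IOS parameters and taking the largest AUC is one way to identify a best oscillometric parameter.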

The second experiment employed the ML algorithms and compared them with the results obtained using a single IOS parameter, to ascertain whether ML could achieve superior performance. The area under the ROC curve (AUC) was selected as the performance evaluation metric. All IOS parameters were included as features in this experiment.

In the third experiment, the effectiveness of SelectKBest as a feature selector for lowering complexity and determining the significance of the various IOS parameters was evaluated. The five classifiers were then trained on the IOS parameters chosen by SelectKBest.

In the fourth and fifth experiments, two model-dependent feature selection algorithms were employed to investigate the significance of the 7 IOS feature parameters in this study. Recursive Feature Elimination with Cross-Validation (RFECV) was used in Experiment 4. RFECV fits a machine learning model to the data, ranks features according to their weights or importance, recursively removes the least important features, and uses cross-validation to assess model performance in each iteration. RFECV creates a performance curve by recording the results for the varying numbers of features retained in each round. In Experiment 5, SelectFromModel was used to choose the most pertinent features based on their importance in a supervised learning model. To increase model efficiency and generalization while preserving important information, the technique trains a supervised learning model, computes feature importance scores, selects the features above a threshold, and then generates a new feature set.
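The two model-dependent selectors described above might be used as in the following sketch. The synthetic data, the random forest as the underlying estimator, the 5-fold stratified cross-validation, and the `"mean"` importance threshold are all assumptions for the illustration; the study does not specify these settings in this section.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV, SelectFromModel
from sklearn.model_selection import StratifiedKFold

feature_names = ["R5", "R20", "R5-R20", "(R5-R20)/R5", "X5", "AX", "Fres"]

rng = np.random.default_rng(0)
X = rng.normal(size=(122, 7))
y = np.array([0] * 78 + [1] * 44)
for col in (0, 3, 6):          # three informative columns in this synthetic example
    X[y == 1, col] += 1.5

estimator = RandomForestClassifier(n_estimators=100, random_state=0)

# RFECV: drop the least important feature one at a time and keep the
# subset size with the best cross-validated AUC.
rfecv = RFECV(
    estimator,
    step=1,
    cv=StratifiedKFold(n_splits=5),
    scoring="roc_auc",
    min_features_to_select=1,
)
rfecv.fit(X, y)
rfecv_features = [n for n, keep in zip(feature_names, rfecv.support_) if keep]
print("RFECV kept:", rfecv_features)

# SelectFromModel: fit the model once and keep the features whose
# importance exceeds the mean importance.
sfm = SelectFromModel(estimator, threshold="mean").fit(X, y)
sfm_features = [n for n, keep in zip(feature_names, sfm.get_support()) if keep]
X_reduced = sfm.transform(X)
print("SelectFromModel kept:", sfm_features)
```

The cross-validated score for each subset size is stored in `rfecv.cv_results_`, which is what a performance curve of the kind mentioned above would be plotted from.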

Hypothesis testing is necessary to compare ML algorithms. A wide variety of parametric tests are available, often based on t-tests. The Wilcoxon Rank-Sum Test, the Kruskal-Wallis Test, and the Mann-Whitney U Test are a few of the most often used nonparametric tests [24,25,26]. In this work, we used the permutation test for hypothesis testing of AUCs [27, 28].
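One common way to set up a permutation test for the difference between two AUCs measured on the same test subjects is sketched below. It is an assumed variant, not necessarily the procedure of the cited references: scores are converted to ranks so that the two models are on a common scale, and in each permutation the two models' ranks are randomly swapped within each subject. All names and score values are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (patient, control) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def to_ranks(scores):
    """Replace scores by ranks (ties share the average rank)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def paired_permutation_test(labels, scores_a, scores_b, n_permutations=2000, seed=0):
    """Two-sided p-value for the AUC difference between two models."""
    rng = random.Random(seed)
    a, b = to_ranks(scores_a), to_ranks(scores_b)
    observed = auc(labels, a) - auc(labels, b)
    extreme = 0
    for _ in range(n_permutations):
        perm_a, perm_b = [], []
        for ra, rb in zip(a, b):
            if rng.random() < 0.5:   # swap the two models' ranks for this subject
                ra, rb = rb, ra
            perm_a.append(ra)
            perm_b.append(rb)
        diff = auc(labels, perm_a) - auc(labels, perm_b)
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, (extreme + 1) / (n_permutations + 1)


# Hypothetical scores for 10 controls (0) and 10 patients (1).
labels = [0] * 10 + [1] * 10
scores_a = [i / 20 for i in range(20)]             # separates the groups perfectly
scores_b = [(i * 7 % 20) / 20 for i in range(20)]  # much weaker separation
observed, p_value = paired_permutation_test(labels, scores_a, scores_b)
print(f"AUC difference = {observed:.2f}, p = {p_value:.4f}")
```

Adding one to both the count and the number of permutations keeps the estimated p-value strictly above zero, which is a conventional safeguard for Monte Carlo permutation tests.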

Results

Table 1 displays the individuals’ biological parameters, spirometry results, chronic respiratory complaints, and IOS data. There was no significant difference in biological characteristics among the three study groups. Among individuals with persistent respiratory symptoms, there was no significant difference in symptoms between the groups with and without spirometry-defined SAD. PPFA patients exhibited considerably lower spirometry parameters (p < 0.05), as Table 1 illustrates.

Table 1 Characteristics of healthy controls and subjects with and without spirometry-SAD

(The last column describes the comparisons between groups, in which the dot means non-significant change, while the dash means significant change.)

The bar graphs in Figure 1 display the IOS parameters for the CG, PPFN, and PPFA groups. According to the analysis of variance (ANOVA), the majority of IOS parameters differed significantly (p < 0.05) across the three groups. PPF patients showed higher R5 and R20 than healthy subjects, indicating greater airway resistance. In addition, patients with SAD in the PPF group showed greater values of R5, R5-R20, AX, and Fres. The (R5-R20)/R5 and X5 levels of the three groups were comparable.

Fig. 1

Comparison of IOS parameters among the three groups. Bar charts show mean + SD. Asterisks indicate statistically significant differences between groups for each IOS parameter. * P < 0.05, ** P < 0.01, *** P < 0.001, **** P < 0.0001

The first experiment: diagnostic accuracy of IOS parameters

Figure 2 presents the findings from Experiment 1. R5 was the best IOS parameter (BOP) for diagnosing PPF patients, with low to moderate diagnostic accuracy: AUC = 0.642 in the CGvsPPFN scenario and AUC = 0.769 in the CGvsPPFA scenario.

Fig. 2

Results of experiment 1, describing the diagnostic accuracy of impulse oscillometry in subjects with chronic respiratory symptoms and preserved pulmonary function. More detailed tables and graphs regarding these results are available in the Additional file (Figure S1).

The second experiment of the study: diagnostic accuracy of the original IOS parameters associated with ML techniques.

Figure 3 presents the AUCs of the BOP, ML algorithm, and MIL classifier obtained in Experiment 2. The ML algorithms improved the AUC, reaching high diagnostic accuracy in both scenarios, CGvsPPFN and CGvsPPFA. In the CGvsPPFN scenario, ADABOOST (AUC = 0.915) had the best performance, followed by RF (AUC = 0.914); the AUCs of RF, SVM, ADABOOST and KNN differed significantly from that of BOP. In the CGvsPPFA scenario, ADABOOST (AUC = 0.971) had the best performance, followed by RF (AUC = 0.951); again, the AUCs of RF, SVM, ADABOOST and KNN differed significantly from that of BOP.

Fig. 3

Results of experiment 2, describing the diagnostic accuracy of impulse oscillometry with ML algorithms in subjects with chronic respiratory symptoms and preserved pulmonary function. Asterisks indicate a statistically significant difference compared to BOP. * P < 0.05, ** P < 0.01. More detailed tables and graphs regarding these results are available in the Additional file (Figures S2-S3).

The third experiment: diagnostic accuracy of the best original IOS parameters associated with ML techniques.

Table 2 shows the IOS parameters selected by SelectKBest for the two scenarios, CGvsPPFN and CGvsPPFA.

As shown in Fig. 4, Experiments 2 and 3 both produced higher AUCs than BOP. A similar pattern was seen in both scenarios when SelectKBest was used as the feature selector: as the number of features increased, the performance of the ML algorithms improved. When 3 and 5 IOS feature parameters were selected for the two scenarios, respectively, the AUC decreased slightly, but overall the diagnostic performance was still better than BOP.

Table 2 The best IOS parameters in SelectKBest
Fig. 4

Summary of Experiment 2 and Experiment 3 (SelectKBest as a feature selector): AUCs for the best oscillometric parameter (BOP), for the best ML algorithms in experiment 3, and for the best ML algorithm with all oscillometric parameters (ADABOOST). The figure indicates the best ML algorithm in each case. Asterisks indicate a statistically significant difference compared to BOP. * P < 0.05, ** P < 0.01. More detailed tables and graphs regarding these results are available in the Additional file (Figures S4-S7).

The fourth and fifth experiment: diagnostic accuracy of the IOS parameters associated with ML techniques.

The best AUC findings for Experiments 4 and 5 are shown in Fig. 5. In both scenarios, the diagnostic performance after feature selection was similar to that obtained with the full parameter set and retained high diagnostic value.

The best-performing configuration of each ML classifier across all experiments is summarized in Tables 3 and 4. In both the CGvsPPFN and CGvsPPFA scenarios, RF, SVM, ADABOOST, and KNN increased the AUC relative to BOP, and the difference was statistically significant. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of the individual ML classifiers are also reported.
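For reference, the four reported metrics follow directly from the confusion matrix of true and predicted labels. The sketch below uses a hypothetical function name and made-up predictions purely to show the definitions.

```python
def diagnostic_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV and NPV from binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return {
        "sensitivity": tp / (tp + fn),  # share of patients detected
        "specificity": tn / (tn + fp),  # share of controls correctly cleared
        "PPV": tp / (tp + fp),          # share of positive calls that are right
        "NPV": tn / (tn + fn),          # share of negative calls that are right
    }


# Made-up example: 4 patients and 6 controls.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

Unlike sensitivity and specificity, PPV and NPV depend on the proportion of patients in the test set, so they are not directly comparable between the CGvsPPFN and CGvsPPFA scenarios, whose group sizes differ.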

Fig. 5

Summary of Experiment 4 and Experiment 5: AUCs for the best oscillometric parameter (BOP), for the best ML algorithms in experiments 4 and 5, and for the best ML algorithm with all oscillometric parameters. The figure indicates the best ML algorithm in each case. Asterisks indicate a statistically significant difference compared to BOP. * P < 0.05, ** P < 0.01. More detailed tables and graphs regarding these results are available in the Additional file (Figures S8-S19).

Table 3 The best configuration for CG VS PPFN
Table 4 The best configuration for CG VS PPFA

Discussion

For the purpose of early screening and treatment of respiratory disorders, a number of chronic respiratory disease guidelines, including GINA 2023 and GOLD 2024, advise early monitoring of changes in small airway function. In our previous study, we found that IOS is more sensitive to detect SAD than spirometry in subjects with chronic respiratory symptoms and PPF, and it correlates better with symptoms. IOS could be an additional method for SAD detection in the early stage of diseases [29]. Other similar research has demonstrated the usefulness of small airway function monitoring with IOS for clinical diagnosis [30,31,32]. We found only four correlated IOS parameters, including R5, R5-R20, AX, and Fres, which had low diagnostic efficacy, with none of the AUC values exceeding 0.7.

To facilitate the diagnosis of respiratory disorders, this study describes the design of a classifier for SAD in the PPF population. By using machine learning approaches, this work aims to improve the diagnostic value of IOS for small airway dysfunction. Additionally, the best set of parameters and algorithms for this task was determined. Compared to a single IOS measure, the results show that this approach increases diagnostic accuracy and streamlines the clinical assessment of IOS.

Similar to our previous study, we found that R5 had the best AUC value, better sensitivity and slightly lower specificity among all parameters. After the introduction of the machine learning algorithms, the AUC, sensitivity, and specificity of the prediction model improved markedly. R5, the single IOS parameter identified in the first experiment, achieved the best single-parameter performance in both the CGvsPPFN and CGvsPPFA scenarios. This finding supports the presence of elevated airway resistance in patients with SAD, as measured by various methods including CT scans and bronchoscopy. It is important to note that these results are based on objective measurements rather than subjective evaluations [33, 34].

In the first scenario, it was more challenging to differentiate the control group from the PPF patients without spirometry-defined SAD, because the differences in IOS parameters were small. The AUC was 0.642, indicating low diagnostic accuracy. In the second scenario, the greater physiological abnormalities resulted in larger differences in the measured parameters, enabling R5 to distinguish between the two groups more readily, with an AUC of 0.769. These findings suggest that a single IOS parameter may not be sufficient to accurately identify SAD in the PPF population.

The diagnostic accuracy was significantly enhanced through the use of the RF, SVM, BAYES, ADABOOST, and KNN algorithms. ADABOOST and RF produced the most favorable results, followed by KNN, SVM and BAYES. This improvement is mainly due to the use of ML algorithms. Similar to earlier research [35,36,37,38], feature selection permits the use of fewer features without appreciably lowering performance. When SelectKBest was employed as a feature selector, 3 and 5 relevant features were selected for the two scenarios, respectively. Although the overall trend indicates that results are better when more parameters are used, the difference between using the fewest and the most parameters is relatively minor. Furthermore, the results with the fewest parameters are still superior to those of BOP alone. This implies that feature selection can achieve good diagnostic value (AUC 0.948 and 0.967, respectively) with fewer IOS parameters. Feature selection identified the most pertinent features in both the CGvsPPFN and CGvsPPFA scenarios. Although a different feature set was chosen for each scenario, R20 and Fres were common to both. This intersection differs slightly from the single-parameter analysis, in which R5 showed the best diagnostic ability for SAD in patients with PPF. This suggests that resonant frequency and central airway resistance, in addition to total airway resistance, play a significant role in the increased airway obstruction observed in the PPF population.

Compared to the SelectFromModel feature selector, the RFECV method may produce superior results and has an efficient selection capability. While it does not increase diagnostic accuracy, it highlights important features such as R5, (R5-R20)/R5, and Fres. Feature selection was performed to make the analysis easier to interpret. Using these three features, we were able to discriminate clearly between groups. These results support the idea of a simple diagnostic model that can help explain the findings of the proposed medical decision support system and make it easier to apply in clinical settings.

Recent studies suggest that IOS is among the most advanced techniques for lung function analysis and one of the most promising emerging techniques in the field [29, 39,40,41]. Despite its advantages in providing detailed and direct examination, IOS has not yet been widely used: interpreting its metrics, which are based on electrical modeling, requires knowledge and experience, and this restricts its application. This study shows how ML algorithms can improve the diagnosis of associated diseases and simplify the use of IOS, thereby improving healthcare for patients with SAD.

Early detection of abnormal respiratory changes in SAD can facilitate timely interventions that may limit disease progression, alleviate adverse symptoms, improve overall health, prevent complications and comorbidities, and reduce premature mortality [5, 42]. Since the 1980s, lung function analysis has been improved by artificial intelligence and machine learning techniques [43,44,45,46,47,48]. The present work expands on previous results by demonstrating that early aberrant respiratory alterations in SAD may be suggested by a combination of IOS measures and a clinical decision support system based on ML technology.

The algorithm presented in this work can be applied not just to SAD but to a variety of other conditions, including asthma, COPD, interstitial lung disease, and others. By establishing appropriate models and finding the best parameters, the relationship between physiological parameters and the development of the disease can be explored. This benefits the early screening of other respiratory diseases and the reduction of the disease burden on patients.

From a clinical technology perspective, more thorough information can be obtained by combining IOS with imaging modalities (such as MRI, CT, and PET) and by developing real-time imaging and dynamic observation techniques. Improvements in image contrast and anatomical detail will make more information available for clinical diagnosis and scientific study [49]. At the same time, artificial intelligence and machine learning can be integrated to analyse and interpret multiple data types, enhance the accuracy and credibility of clinical examination results, and support the development of automated, intelligent analysis tools. In the context of big data, it is also crucial to encourage IOS standardization and data sharing, to create data-sharing platforms with standardized data formats, and to facilitate multi-center data comparison and analysis, all of which promote the field’s progress [50].

Finally, it is important to consider and clarify some significant limitations. Firstly, this study is limited to the Chinese population in a specific location. Therefore, it is not possible to ensure its generalisability to different populations. It is recommended that future studies investigate multi-centre data to expand the generalisability of the findings. The experimental design of this work followed globally recognised inclusion and exclusion criteria and was conducted in a typical clinical setting.

Additionally, it is important to note that the PPF population in China is relatively small due to low public health awareness. Many individuals do not seek medical attention promptly when experiencing clinical symptoms such as cough and chest tightness. Therefore, due to the relatively small size of the available dataset, it is necessary to carefully control the complexity of the ML model. In addition to the measures taken in this study to avoid overfitting, such as controlling hyperparameters, feature selection can also aid in controlling overfitting by reducing inputs. Another reason for using feature selection is that a smaller number of features can help simplify the analysis. Furthermore, utilising only three features enables the visualisation of group separation, aiding diagnostic interpretation.

Conclusions

In this work, a variety of machine learning algorithms were utilized to create a clinical auxiliary diagnosis system that can identify respiratory anomalies in patients with PPF. In the initial disease stage (CGvsPPFN), respiratory oscillation parameters achieved low diagnostic accuracy (AUC = 0.642), but ML classifiers significantly improved accuracy (AUC ≥ 0.9). In the progressive disease stage (CGvsPPFA), using oscillation parameters alone yielded moderate accuracy (AUC = 0.769), while ML algorithms greatly enhanced accuracy (AUC ≥ 0.9). The developed diagnostic system simplifies IOS application in PPF patients, utilizing key IOS parameters identified through feature selection. All things considered, combining ML algorithms with IOS examination improves pulmonary function assessment in PPF patients, indicating future improvements in patient care.

Data availability

No datasets were generated or analysed during the current study.

Abbreviations

COPD:

Chronic obstructive pulmonary disease

SAD:

Small airway dysfunction

PPF:

Preserved pulmonary function

ML:

Machine learning

RF:

Random Forests

SVM:

Support Vector Machines

BAYES:

Naive Bayes

ADABOOST:

Adaptive Boosting

KNN:

K-Nearest Neighbors

FEV1:

The forced expiratory volume in 1 s

FVC:

Forced vital capacity

AHR:

Airway hyper-responsiveness

BR:

Bronchial reversibility

FEF25–75%:

The forced expiratory flow between 25 and 75% of FVC

FEF50%:

The forced expiratory flow when 50% of FVC has been exhaled

FEF75%:

The forced expiratory flow when 75% of FVC has been exhaled

IOS:

Impulse oscillometry

Rrs:

Resistance of the respiratory system

Xrs:

Reactance of the respiratory system

R5:

Respiratory resistance at 5 Hz

R20:

Respiratory resistance at 20 Hz

R5–R20:

The difference between R5 and R20

X5:

Reactance at 5 Hz

Fres:

Resonant Frequency

AX:

Area under the reactance curve between Fres and 5 Hz

BOP:

Best Oscillometric Parameter

ATS:

American Thoracic Society

ERS:

European Respiratory Society

ROC:

Receiver Operating Characteristic

AUC:

Area Under the Curve

PPV:

Positive Predictive Value

NPV:

Negative Predictive Value

BMI:

Body Mass Index

References

  1. Stockley JA, Ismail AM, Hughes SM, Edgar R, Stockley RA, Sapey E. Maxi–Mal mid-expiratory flow detects early lung disease in α(1)-antitrypsin deficiency. Eur Respir J. 2017;49:1602055.

    Article  PubMed  Google Scholar 

  2. Schroeder JD, McKenzie AS, Zach JA, Wilson CG, Curran-Everett D, Stinson DS, Newell JD Jr, Lynch DA. Relationships between airflow obstruction and quantitative CT measurements of emphysema, air trapping, and airways in subjects with and without chronic obstructive pulmonary disease. AJR Am J Roentgenol. 2013;201:W460–470.

    Article  PubMed  PubMed Central  Google Scholar 

  3. Skylogianni E, Triga M, Douros K, Bolis K, Priftis KN, Fouzas S. Anthraco–Poulos MB. Small-airway dysfunction precedes the development of asthma in children with allergic rhinitis. Allergol Immunopathol (Madr). 2018;46:313–21.

    Article  CAS  PubMed  Google Scholar 

  4. Woodruff PG, Barr RG, Bleecker E, Christenson SA, Couper D, Curtis JL, Gouskova NA, Hansel NN, Hoffman EA, Kanner RE, et al. Clinical significance of symptoms in smokers with preserved pulmonary function. NEngl J Med. 2016;374:1811–21.

    Article  CAS  Google Scholar 

  5. Xiao D, Chen Z, Wu S, Huang K, Xu J, Yang L, Xu Y, Zhang X, Bai C, Kang J, et al. Prevalence and risk factors of small airway dysfunction, and association with smoking, in China: findings from a national cross-sectional study. Lancet Respir Med. 2020;8:1081–93.

    Article  PubMed  Google Scholar 

  6. Burgel PR, Bergeron A, de Blic J, et al. Small airways diseases, excluding asthma and COPD: an overview. Eur Respir Rev. 2013;22(128):131–47. https://doi.org/10.1183/09059180.00001313.

    Article  PubMed  PubMed Central  Google Scholar 

  7. Konstantinos Katsoulis K, Kostikas K, Kontakiotis T. Techniques for assessing small airways function: possible applications in asthma and COPD. Respir Med. 2016;119:e2–9.

    Article  CAS  PubMed  Google Scholar 

  8. Contoli M, Bousquet J, Fabbri LM, Magnussen H, Rabe KF, Siafakas NM, Hamid Q, Kraft M. The small airways and distal lung compartment in asthma and COPD: a time for reappraisal. Allergy. 2010;65:141–51.

    Article  CAS  PubMed  Google Scholar 

  9. King GG, Bates J, Berger KI, Calverley P, de Melo PL, Dellaca RL, Farre R, Hall GL, Ioan I, Irvin CG, et al. Technical standards for respiratory oscillometry. Eur Respir J. 2020;55:1900753.

    Article  PubMed  Google Scholar 

  10. Skloot G, Goldman M, Fischler D, Goldman C, Schechter C, Levin S, Teirstein A. Respiratory symptoms and physiologic assessment of ironworkers at the World Trade Center disaster site. Chest. 2004;125:1248–55.

    Article  PubMed  Google Scholar 

  11. Oppenheimer BW, Goldring RM, Herberg ME, Hofer IS, Reyfman PA, Liautaud S, Rom WN, Reibman J, Berger KI. Distal airway function in symptomatic subjects with normal spirometry following World Trade Center dust exposure. Chest. 2007;132:1275–82.

    Article  PubMed  Google Scholar 

  12. Su ZQ, Guan WJ, Li SY, Ding M, Chen Y, Jiang M, Chen XB, Zhong CH, Tang CL, Zhong NS. Significances of spirometry and impulse oscillometry for detecting small airway disorders assessed with endobronchial optical coherence tomography in COPD. Int J Chron Obstruct Pulmon Dis. 2018;13:3031–44.

    Article  PubMed  PubMed Central  Google Scholar 

  13. Williamson PA, Clearie K, Menzies D, Vaidyanathan S, Lipworth BJ. Assessment of small-airways disease using alveolar nitric oxide and impulse oscillometry in asthma and COPD. Lung. 2011;189:121–9.

    Article  PubMed  Google Scholar 

  14. Graham BL, Steenbruggen I, Miller MR, et al. Standardization of spirometry 2019 update. An official American thoracic society and European respiratory society technical statement[J]. Am J Respir Crit Care Med. 2019;200(8):e70–88.

    Article  PubMed  PubMed Central  Google Scholar 

  15. Jian W, Gao Y, Hao C, et al. Reference values for spirometry in Chinese aged 4–80 years[J]. J Thorac Disease. 2017;9(11):4538.

    Article  Google Scholar 

  16. BREIMAN L. Random forests. ML, 2001, 45: 5–32.

  17. CORTES C. Vladimir. Support-vector networks. ML. 1995;20:273–97.

    Google Scholar 

  18. MANNING, RAGHAVAN CD. Prabhakar; SCHÜTZE, Hinriche. Xml retrieval. Introduction to Information Retrieval; 2008.

  19. SCHAPIRE FREUNDY, Robert E. A decision-theoretic generalization of on-line learning and an application to boosting. J Comput Syst Sci. 1997;55(1):119–39.

    Article  Google Scholar 

  20. COVER T, HART P. Nearest neighbor pattern classification. IEEE Trans Inf Theory. 1967;13(1):21–7.

    Article  Google Scholar 

  21. PEDREGOSA F, et al. Scikit-learn: ML in Python. J ML Res. 2011;12:2825–30.

    Google Scholar 

  22. GUYON, Isabelle, et al. Gene selection for cancer classification using support vector machines. ML. 2002;46:389–422.

    Google Scholar 

  23. GUYON I, ELISSEEFF André. An introduction to variable and feature selection. J ML Res. 2003;3:1157–82.

    Google Scholar 

  24. WILCOXON F. Individual comparisons by ranking methods. Breakthroughs in statistics: methodology and distribution. New York, NY: Springer New York; 1992. pp. 196–202.

    Chapter  Google Scholar 

  25. WALLIS KRUSKALWH, Allen W. Use of ranks in one-criterion variance analysis. J Am Stat Assoc, 1952, 583–621.

  26. MANN, WHITNEY HB, Donald R. On a test of whether one of two random variables is stochastically larger than the other. Ann Math Stat, 1947, 50–60.

  27. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143(1):29–36.

    Article  CAS  PubMed  Google Scholar 

  28. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach[J]. Biometrics, 1988: 837–45.

  29. Li LY, Yan TS, Yang J, et al. Impulse oscillometry for detection of small airway dysfunction in subjects with chronic respiratory symptoms and preserved pulmonary function[J]. Respir Res. 2021;22:1–10.

    Article  PubMed  PubMed Central  Google Scholar 

  30. Chiu HY, Hsiao YH, Su KC, Lee YC, Ko HK, Perng DW. Small Airway Dysfunction by Impulse Oscillometry in symptomatic patients with preserved pulmonary function. J Allergy Clin Immunol Pract. 2020;8(1):229–e2353. https://doi.org/10.1016/j.jaip.2019.06.035.

    Article  PubMed  Google Scholar 

  31. Crisafulli E, Pisi R, Aiello M, et al. Prevalence of small-airway dysfunction among COPD patients with different GOLD stages and its role in the impact of Disease. Respiration. 2017;93(1):32–41. https://doi.org/10.1159/000452479.

    Article  PubMed  Google Scholar 

  32. Anderson WJ, Zajda E, Lipworth BJ. Are we overlooking persistent small airways dysfunction in community-managed asthma? Ann Allergy Asthma Immunol. 2012;109(3):185–e1892. https://doi.org/10.1016/j.anai.2012.06.022.

    Article  PubMed  Google Scholar 

  33. Postma DS, Brightling C, Baldi S, et al. Exploring the relevance and extent of small airways dysfunction in asthma (ATLANTIS): baseline data from a prospective cohort study[J]. Lancet Respiratory Med. 2019;7(5):402–16.

    Article  Google Scholar 

  34. McNulty W, Usmani OS. Techniques of assessing small airways dysfunction[J]. Eur Clin Respiratory J. 2014;1(1):25898.

    Article  Google Scholar 

  35. Amaral JLM, Sancho AG, Faria ACD, Lopes AJ, Melo PL. Differential diagnosis of asthma and restrictive respiratory diseases by combining forced oscillation measurements, machine learning and neuro-fuzzy classifiers. Med Biol Eng Comput. 2020;58(10):2455–73.

    Article  PubMed  Google Scholar 

  36. Amaral JLM, Lopes AJ, Jansen JM, Faria ACD, Melo PL. Machine learning algorithms and forced oscillation measurements applied to the automatic identification of chronic obstructive pulmonary disease. Comput Methods Programs Biomed. 2012;105(3):183–93.

    Article  PubMed  Google Scholar 

  37. Amaral JLM, Lopes AJ, Veiga J, et al. High-accuracy detection of airway obstruction in asthma using machine learning algorithms and forced oscillation measurements[J]. Comput Methods Programs Biomed. 2017;144:113–25.

    Article  PubMed  Google Scholar 

  38. Andrade DSM, Ribeiro LM, Lopes AJ, et al. Machine learning associated with respiratory oscillometry: a computer-aided diagnosis system for the detection of respiratory abnormalities in systemic sclerosis[J]. Biomed Eng Online. 2021;20(1):1–18.

    Article  Google Scholar 

  39. Bednarek M, Grabicki M, Piorunek T, et al. Current place of impulse oscillometry in the assessment of pulmonary diseases[J]. Respir Med. 2020;170:105952.

    Article  PubMed  Google Scholar 

  40. Sarkar S, Jadhav U, Ghewade B et al. Oscillometry in lung function Assessment: a Comprehensive Review of Current insights and Challenges[J]. Cureus, 2023, 15(10).

  41. Avila N, Nazeran H, Gordillo N, et al. Computer-aided classification of small airways dysfunction using impulse oscillometric features: a children-focused review[J]. Biomedical Engineering/Biomedizinische Technik. 2020;65(2):121–31.

    Article  PubMed  Google Scholar 

  42. Cottini M, Lombardi C, Berti A, et al. Small-airway dysfunction in paediatric asthma[J]. Curr Opin Allergy Clin Immunol. 2021;21(2):128–34.

    Article  PubMed  Google Scholar 

  43. Topalovic M, Das N, Burgel PR et al. Artificial intelligence outperforms pulmonologists in the interpretation of pulmonary function tests[J]. Eur Respir J, 2019, 53(4).

  44. Das N, Happaerts S, Gyselinck I et al. Collaboration between explainable artificial intelligence and pulmonologists improves the accuracy of pulmonary function test interpretation[J]. Eur Respir J, 2023, 61(5).

  45. Giri PC, Chowdhury AM, Bedoya A, et al. Application of machine learning in pulmonary function assessment where are we now and where are we going?[J]. Front Physiol. 2021;12:678540.

    Article  PubMed  PubMed Central  Google Scholar 

  46. Wang Y, Li Y, Chen W, et al. Deep learning for spirometry quality assurance with spirometric indices and curves[J]. Respir Res. 2022;23(1):1–9.

    Article  Google Scholar 

  47. Das N, Verstraete K, Stanojevic S et al. Deep-learning algorithm helps to standardise ATS/ERS spirometric acceptability and usability criteria[J]. Eur Respir J, 2020, 56(6).

  48. Park H, Yun J, Lee SM, et al. Deep learning–based Approach to predict pulmonary function at chest CT[J]. Radiology. 2023;307(2):e221488.

    Article  PubMed  Google Scholar 

  49. Wichum F, Wiede C, Seidl K. Depth-based measurement of respiratory volumes: a Review[J]. Sensors. 2022;22(24):9680.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  50. Wang Y, Li Q, Chen W, et al. Deep learning-based analytic models based on flow-volume curves for identifying ventilatory patterns[J]. Front Physiol. 2022;13:824000.

    Article  PubMed  PubMed Central  Google Scholar 

Download references

Acknowledgements

Not applicable.

Funding

This study was partly supported by a National Natural Science Foundation of China grant (NSFC No. 81800016) and a Sichuan Science and Technology Agency grant (2019YFS0033). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Contributions

XWJ contributed to study design, manuscript writing and data analysis. SWY contributed to data acquisition and analysis. SXY and LLY contributed to study design and data interpretation. XXP and WYM contributed to data acquisition and interpretation. LBM contributed to study design and manuscript revision. FJM contributed to language polishing and proofreading of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Bin-Miao Liang.

Ethics declarations

Ethical approval and consent to participate

This study conformed to the Declaration of Helsinki and was approved by the Ethics Committee of West China Hospital, Sichuan University, China. All participants signed an informed consent form before the procedure.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Xu, WJ., Shang, WY., Feng, JM. et al. Machine learning for accurate detection of small airway dysfunction-related respiratory changes: an observational study. Respir Res 25, 286 (2024). https://doi.org/10.1186/s12931-024-02911-1

  • DOI: https://doi.org/10.1186/s12931-024-02911-1

Keywords