Updated reference values for static lung volumes from a healthy population in Austria

Background: Reference values for lung volumes are necessary to identify and diagnose restrictive lung diseases and hyperinflation, but the values have to be validated in the relevant population. Our aim was to investigate the Global Lung Function Initiative (GLI) reference equations in a representative healthy Austrian population and to create population-derived reference equations if poor fit was observed.
Methods: We analysed spirometry and body plethysmography data from 5371 respiratory healthy subjects (6–80 years) from the Austrian LEAD Study. Fit with the GLI equations was examined using z-scores and distributions within the limits of normality. LEAD reference equations were then created using the LMS method and the generalized additive models for location, scale and shape (GAMLSS) package, according to the GLI models.
Results: Good fit, defined as mean z-scores between +0.5 and -0.5, was not observed for the GLI static lung volume equations, with mean z-scores > 0.5 for residual volume (RV), RV/TLC (total lung capacity) and TLC in both sexes, and for expiratory reserve volume (ERV) and inspiratory capacity in females. Distributions within the limits of normality were shifted towards the upper limit, except for ERV. Population-derived reference equations from the LEAD cohort showed superior fit for lung volumes and provided reproducible results.
Conclusion: GLI lung volume reference equations demonstrated a poor fit for our cohort, especially in females. Therefore, a new set of Austrian reference equations for static lung volumes was developed that can be applied to both children and adults (6–80 years of age).
Supplementary Information: The online version contains supplementary material available at 10.1186/s12931-024-02782-6.


Introduction
The diagnosis of respiratory disease is largely based on measurement of lung physiology. A disease can be described as a set of characteristics by which affected individuals differ from the norm in such a way that they are biologically disadvantaged [1]. Reference values are used to help identify and diagnose individuals with abnormal values. Apart from the forced manoeuvres measured in spirometry, lung function can be described using lung volumes, determined by body plethysmography or gas dilution methods. In particular, restrictive lung disease can only be diagnosed by measuring the total lung capacity (TLC), which requires lung volume measurement [2].
The most commonly used reference values for lung volumes in adult populations are from the European Coal and Steel Community (ECSC), which were derived from data in 1983, and have limitations in terms of the inclusion of smokers and the lack of females [2,3]. These are not applicable to children, and so separate reference values have to be used, the most common being based on work by Zapletal and colleagues published in the 1970s [3]. Values by Rosenthal et al. were also published more than 20 years ago [4]. Recognizing the need to update reference values for lung function testing, in 2012 the Global Lung Function Initiative (GLI) published multiethnic spirometry reference values that could be used across an age range of 3 to 95 years, with separate calculations for males and females [5]. Subsequently, the GLI published reference values for static lung volumes that are applicable to assessment either by gas dilution methods or plethysmography [6]. Whereas the GLI spirometry values are based on data from over 74,000 examinations and have been validated in a number of different populations [7], the static lung volume reference values are based on a more limited dataset of approximately 7,700 measurements [5,6] and require further validation. We therefore aimed to investigate the fit of the GLI lung volume equations in a cohort of healthy never smokers in Austria. If the fit for the Austrian population proved poor, we planned to create population-derived reference equations.

Population and study design
The LEAD (Lung, hEart, sociAl, boDy) Study (ClinicalTrials.gov; NCT01727518; http://clinicaltrials.gov) is an ongoing, longitudinal, observational, population-based cohort study that aims to provide a comprehensive database of risk factors for non-communicable diseases. The study has recruited a random sample (stratified by age, sex, and residential area) of males and females aged 6-80 years from Vienna and Lower Austria who are representative of the general Austrian population, and who have been assessed every 4 years since 2011 [8]. LEAD is being carried out according to the Declaration of Helsinki (2008) and has been approved by the Vienna local ethics committee (EK-11-117-0711). Written informed consent was given by all participants (or by parents or legal representatives for those aged under 18 years).
The current analyses focus on pre-bronchodilator data collected at the baseline visit. At each visit, all participants undergo spirometry and body plethysmography lung function testing by trained personnel at the LEAD study centre of the Ludwig Boltzmann Institute for Lung Health at the Clinic Penzing in Vienna, Austria. All measurements were conducted according to international recommendations (European Respiratory Society [ERS]/American Thoracic Society [ATS]) [9,10], using a BT-MasterScope Body 0478© (Jaeger, Germany) with the JLAB software. The body plethysmograph was calibrated daily using a 3 L syringe and a box pressure calibrator. Lung volume indices were expressed in body temperature pressure saturated conditions.
The lung function examination started with the subject sitting and breathing steadily, registering the pressure-flow diagrams, and producing at least three reproducible diagrams. Functional residual capacity (FRC) was then measured by closure of the shutter at the end of a normal expiration. At least two FRC loops were obtained, with the subject breathing against the shutter at resting ventilation. The subject then carried out a maximal expiration to measure expiratory reserve volume (ERV), with residual volume (RV) calculated by subtracting ERV from FRC, followed by a slow, maximal inspiration, from which inspiratory capacity (IC) was measured. Finally, forced expiratory volume in 1 s (FEV1) and forced vital capacity (FVC) were assessed using forced spirometry, with three acceptable and reproducible loops obtained. Total lung capacity (TLC) was determined by adding RV to the best achieved vital capacity (VC), either from body plethysmography or spirometry. Strict regular quality control was in place for data collection and entry.
Age was registered in full days between the participant's date of birth and the date of the visit and is expressed in years with two decimals. Height was measured in centimeters without decimals. Weight was measured in kilograms with two decimals.
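The volume arithmetic described above (RV = FRC - ERV, TLC = RV + best VC) can be illustrated with a minimal Python sketch. The function and field names are hypothetical, the example values are invented for illustration, and taking the larger of the two VC values as the "best achieved" VC is an assumption.

```python
from dataclasses import dataclass


@dataclass
class LungVolumes:
    """Derived static lung volumes, all in litres (BTPS)."""
    rv: float      # residual volume
    tlc: float     # total lung capacity
    rv_tlc: float  # RV/TLC as a fraction


def derive_volumes(frc: float, erv: float, vc_pleth: float, vc_spiro: float) -> LungVolumes:
    """Derive RV, TLC and RV/TLC from measured FRC, ERV and VC."""
    if erv > frc:
        raise ValueError("ERV cannot exceed FRC")
    rv = frc - erv                     # RV = FRC - ERV
    vc_best = max(vc_pleth, vc_spiro)  # assumption: "best" VC is the larger one
    tlc = rv + vc_best                 # TLC = RV + best VC
    return LungVolumes(rv=rv, tlc=tlc, rv_tlc=rv / tlc)


# Invented example values, not data from the LEAD cohort.
example = derive_volumes(frc=3.2, erv=1.4, vc_pleth=4.9, vc_spiro=5.0)
```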
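As an illustration of the age definition above, the following Python sketch converts whole days between two dates into years with two decimals. The divisor of 365.25 days per year is an assumption, as the conversion factor is not stated here, and the function name is hypothetical.

```python
from datetime import date


def age_in_years(birth: date, visit: date) -> float:
    """Age in years with two decimals, computed from whole days."""
    days = (visit - birth).days
    if days < 0:
        raise ValueError("visit date precedes date of birth")
    return round(days / 365.25, 2)  # 365.25 days/year is an assumption


# Invented dates for illustration only.
age = age_in_years(date(2000, 1, 1), date(2012, 7, 1))
```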

Definition of healthy never smoking respiratory cohort
All current and ex-smokers were excluded from the analyses. Participants with respiratory symptoms (wheeze, cough, sputum, or dyspnoea) in the last 12 months, as reported in an interview-based questionnaire, were also excluded. Furthermore, subjects with a doctor's diagnosis of asthma, chronic obstructive pulmonary disease, chronic bronchitis, or emphysema were excluded.
In order to avoid extreme outliers, subjects with Z-scores beyond ± 5 for height, weight or spirometric values were excluded from the analyses; the lung function reports of outliers were re-checked for errors and evaluated for the quality of the flow diagrams. Finally, we included only subjects with a complete set of pre- and post-bronchodilation spirometry and body plethysmography. As we believed this definition would describe pulmonary healthy subjects, no further exclusion criteria using spirometry or lung volumes were used.
To evaluate the cohort for single centre bias concerning pulmonary function testing, we included data from study participants who underwent a second pulmonary function test, using the same protocol, in the pulmonary function testing laboratory of the Clinic Penzing, Vienna. These participants were selected from the initial study population for bronchial challenge testing and do not necessarily correspond to the subjects in the healthy study cohort.
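The outlier rule above can be sketched as a simple filter. This is only an illustration: the record layout, the key names and the example subjects are hypothetical, and flagged records are returned separately so that their reports can be re-checked, as described.

```python
def exclude_extreme_outliers(records, keys=("height_z", "weight_z", "fev1_z", "fvc_z"), limit=5.0):
    """Split records into (kept, flagged), flagging any with |z| > limit."""
    kept, flagged = [], []
    for rec in records:
        if any(abs(rec[k]) > limit for k in keys):
            flagged.append(rec)  # candidate for manual re-check and exclusion
        else:
            kept.append(rec)
    return kept, flagged


# Invented subjects for illustration only.
subjects = [
    {"id": 1, "height_z": 0.3, "weight_z": -0.8, "fev1_z": 0.1, "fvc_z": 0.4},
    {"id": 2, "height_z": 5.6, "weight_z": 0.2, "fev1_z": -0.5, "fvc_z": -0.1},
    {"id": 3, "height_z": -1.1, "weight_z": 1.9, "fev1_z": -5.2, "fvc_z": -4.8},
]
kept, flagged = exclude_extreme_outliers(subjects)
```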

Statistical analysis
Z-scores were calculated for the cohort using the available GLI reference equations for pre-bronchodilation spirometry and lung volumes [5,6]. Spirometry was included to check for general comparability to the GLI cohorts. Fit was analysed using the mean Z-scores, the 95% confidence intervals and the percentage above the upper limit of normal (ULN) and below the lower limit of normal (LLN). A good fit was to be concluded if: 1) the mean Z-score was between +0.5 and -0.5; 2) the standard deviation (SD) was approximately 1; and 3) ≤ 5% of the observations were below the LLN and ≤ 5% were above the ULN [11].
Population-specific reference equations were created based on the same healthy cohort using the LMS method, consistent with GLI [5], as described earlier by Cole et al. [12], and the generalised additive models for location, scale and shape (GAMLSS) package in R (Version 4.2.2, R Foundation, Vienna, Austria, http://www.r-project.org). Equations were generated separately for males and females, with height and age being the predictive variables. The LMS method allows modelling of the skewness (lambda), the median (mu) and the coefficient of variation (sigma). Fit of the equations was determined using Q-Q plots, worm plots and the distribution of Z-scores. The Kolmogorov-Smirnov test was used to test for normal distribution, indicated by a p-value > 0.05. Degrees of freedom were adapted to achieve the lowest Schwarz Bayesian criterion while avoiding overly complex models.
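The three fit criteria can be expressed as a small check, sketched here in Python. Two points are assumptions rather than statements from the text: the LLN and ULN are taken as Z-scores of -1.645 and +1.645 (the 5th and 95th percentiles of a standard normal distribution), and "approximately 1" is operationalised as an SD within 0.1 of 1. The synthetic Z-scores are for illustration only.

```python
from statistics import NormalDist, mean, stdev

LLN_Z = -1.645  # assumed LLN: 5th percentile of a standard normal
ULN_Z = 1.645   # assumed ULN: 95th percentile


def assess_fit(z_scores, sd_tolerance=0.1):
    """Summarise the fit of a reference equation from a list of Z-scores."""
    n = len(z_scores)
    m = mean(z_scores)
    sd = stdev(z_scores)
    pct_below = 100.0 * sum(z < LLN_Z for z in z_scores) / n
    pct_above = 100.0 * sum(z > ULN_Z for z in z_scores) / n
    good = (
        -0.5 <= m <= 0.5
        and abs(sd - 1.0) <= sd_tolerance  # tolerance is an assumption
        and pct_below <= 5.0
        and pct_above <= 5.0
    )
    return {"mean": m, "sd": sd, "pct_below_lln": pct_below,
            "pct_above_uln": pct_above, "good_fit": good}


# Synthetic Z-scores: evenly spaced normal quantiles, then the same shifted upwards.
centred = [NormalDist().inv_cdf((i + 0.5) / 200) for i in range(200)]
shifted = [z + 0.7 for z in centred]
fit_centred = assess_fit(centred)
fit_shifted = assess_fit(shifted)
```

In this synthetic example the shifted sample fails both the mean criterion and the ULN criterion.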
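For orientation, the LMS transformation of Cole et al. converts a measurement into a Z-score as z = ((y/M)^L - 1)/(L·S), with the limiting form ln(y/M)/S when L is zero. The Python sketch below implements this transformation and its inverse, which yields the value at a chosen Z-score (for example an LLN at z = -1.645, an assumed cut-off). The parameter values are invented for illustration and are not coefficients from this study.

```python
import math


def lms_z_score(measured, L, M, S):
    """Z-score of a measurement given LMS parameters (skewness, median, CV)."""
    if measured <= 0 or M <= 0 or S <= 0:
        raise ValueError("measured, M and S must be positive")
    if abs(L) < 1e-12:
        return math.log(measured / M) / S  # limiting case L -> 0
    return ((measured / M) ** L - 1.0) / (L * S)


def lms_value_at_z(z, L, M, S):
    """Inverse transform: the measurement corresponding to a given Z-score."""
    if abs(L) < 1e-12:
        return M * math.exp(S * z)
    return M * (1.0 + L * S * z) ** (1.0 / L)


# Invented LMS parameters for illustration only (e.g. a TLC in litres).
L_demo, M_demo, S_demo = 0.5, 6.0, 0.12
lln_demo = lms_value_at_z(-1.645, L_demo, M_demo, S_demo)
z_back = lms_z_score(lln_demo, L_demo, M_demo, S_demo)
```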

Results
The analyses used data from 5371 subjects (Fig. 1), including 2397 males (43.9%) and 2974 females (56.1%), aged from 6 to 80 years. The baseline characteristics of this cohort are shown in Table 1 for males and Table 2 for females. The majority of included individuals were between 6 and 30 years of age. A decline in lung function with age was observed in both sexes, and was more pronounced for FEV1 and FVC than for lung volumes. In contrast, RV, RV/TLC and FRC increased with age.
In a first step, Z-scores were calculated using the GLI spirometry equations, to check for comparability to the Caucasian GLI cohorts. A good fit was observed for all spirometry indices (Table 3). In females, slightly fewer than the expected 5% of values fell below the LLN for FEV1 and FVC, especially at age > 65 years.

Existing reference equations for lung volume data
The fit of the GLI static lung volume equations was poor, as shown by the mean Z-scores in Table 4. Mean Z-scores for RV and RV/TLC using the GLI reference values were outside the ± 0.5 range for both males and females, with fit also poor for TLC, IC and ERV in females. Furthermore, there was a shift towards higher values for all indices except ERV, as indicated by a higher proportion of values above the ULN than below the LLN. Normality was rejected for all indices (p < 0.05 in the Kolmogorov-Smirnov test). An acceptable fit was observed for FRC, IC and ERV in males, especially in the age group between 18 and 65 years.
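To illustrate how a shift in Z-scores leads to rejection of normality, the following Python sketch computes the one-sample Kolmogorov-Smirnov statistic against a standard normal distribution and compares it with the large-sample critical value 1.358/sqrt(n) for alpha = 0.05. Testing against N(0, 1) and using the asymptotic critical value instead of an exact p-value are simplifying assumptions; the Z-scores are synthetic.

```python
import math
from statistics import NormalDist


def ks_statistic_vs_standard_normal(z_scores):
    """One-sample Kolmogorov-Smirnov statistic D against N(0, 1)."""
    xs = sorted(z_scores)
    n = len(xs)
    cdf = NormalDist().cdf
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def rejects_normality(z_scores, coefficient=1.358):
    """Large-sample test at alpha = 0.05: reject if D > 1.358 / sqrt(n)."""
    n = len(z_scores)
    return ks_statistic_vs_standard_normal(z_scores) > coefficient / math.sqrt(n)


# Synthetic Z-scores: evenly spaced normal quantiles, then shifted upwards by 0.7.
centred_z = [NormalDist().inv_cdf((i + 0.5) / 400) for i in range(400)]
shifted_z = [z + 0.7 for z in centred_z]
```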

Creation of population-specific reference equations
Given the unsatisfactory fit of the lung volume data when using the GLI reference equations, new equations were created using the LMS method (Table 5, Supplementary Figures 1 and 2). Consistent with the approach used by GLI, subjects with calculated Z-scores beyond ± 5 were excluded before recalculating the equations, to avoid influence by extreme outliers. Look-up tables containing the varying coefficients were created and are available in the online supplement. All equations showed a good fit, with mean Z-scores of 0 and SDs of 1 (Table 6). Furthermore, all distributions were even, with approximately 5% of subjects above the ULN and approximately 5% below the LLN. All indices were normally distributed in the Kolmogorov-Smirnov test.
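How an equation with an age-varying spline term might be evaluated from a look-up table is sketched below in Python. This assumes the GLI-style functional form M = exp(a0 + a1·ln(height) + a2·ln(age) + Mspline(age)), which is not spelled out in this text, and linear interpolation between tabulated ages, which is also an assumption. All coefficients and spline values are placeholders invented for illustration; they are not the values from Table 5 or the online supplement.

```python
import math
from bisect import bisect_right


def interpolate(table, age):
    """Linearly interpolate a sorted [(age, value), ...] look-up table."""
    ages = [a for a, _ in table]
    if not ages[0] <= age <= ages[-1]:
        raise ValueError("age outside the tabulated range")
    hi = bisect_right(ages, age)
    if hi == len(ages):  # age equals the last tabulated age
        return table[-1][1]
    (a0, v0), (a1, v1) = table[hi - 1], table[hi]
    return v0 + (v1 - v0) * (age - a0) / (a1 - a0)


def predicted_median(height_cm, age, coef, mu_spline):
    """Assumed GLI-style median: exp(a0 + a1*ln(height) + a2*ln(age) + Mspline)."""
    return math.exp(coef["a0"] + coef["a1"] * math.log(height_cm)
                    + coef["a2"] * math.log(age) + interpolate(mu_spline, age))


# Placeholder values only, NOT the published LEAD coefficients.
DEMO_COEF = {"a0": -10.0, "a1": 2.3, "a2": 0.02}
DEMO_MU_SPLINE = [(6.0, 0.0), (40.0, 0.02), (80.0, -0.02)]
demo_median = predicted_median(170, 40.0, DEMO_COEF, DEMO_MU_SPLINE)
```

The resulting median would then serve as M in the LMS transformation, together with L and S obtained in the same way.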

Intraindividual variability
As this was a single centre study, a measurement bias due to operator or equipment could not be excluded. However, a subgroup of the LEAD cohort underwent additional pulmonary function testing at a different site: participants with a history of atopy, allergy, eosinophilia or a positive skin prick test were selected for bronchial challenge testing, which was carried out at the pulmonary function lab of the Clinic Penzing. The protocol and type of equipment were the same as in the study centre (BT-MasterScope Body 0478; Jaeger, Germany). Normal spirometry and plethysmography were carried out, though only TLC, RV and ERV were available in the database. During Phase 1, 765 individuals underwent the additional testing; after excluding all those with missing or invalid data, 706 participants remained. As the mean interval between the measurements was 40 months, a manual quality check was carried out to exclude children and adolescents in whom natural growth led to large differences between the two test dates, which contributed to the high number of exclusions. In the end, data from 602 participants were analysed. As the mean intraindividual difference was < 100 ml for all included parameters (FEV1, FVC, ERV, RV, TLC), a single centre measurement bias seemed unlikely (Table 7).
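The intraindividual comparison can be illustrated with a short Python sketch that computes the mean and SD of paired differences in millilitres and compares the mean with the 100 ml threshold mentioned above. The function name is hypothetical and the paired values are invented; they are not data from Table 7.

```python
from statistics import mean, stdev


def paired_difference_ml(first_l, second_l):
    """Mean and SD of within-subject differences (second - first), in ml."""
    if len(first_l) != len(second_l):
        raise ValueError("paired lists must have equal length")
    diffs = [(b - a) * 1000.0 for a, b in zip(first_l, second_l)]
    return mean(diffs), stdev(diffs)


# Invented paired TLC values in litres, one entry per subject.
tlc_visit1 = [6.10, 5.40, 7.25, 4.80]
tlc_visit2 = [6.18, 5.35, 7.30, 4.86]
mean_diff, sd_diff = paired_difference_ml(tlc_visit1, tlc_visit2)
within_threshold = abs(mean_diff) < 100.0
```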

Discussion
These analyses use cross-sectional data obtained from a broad, representative healthy population sample from Austria to investigate the fit of the GLI lung volume reference equations. As the GLI equations failed to demonstrate a good fit with our population-based data in normal subjects, a new set of sex-specific reference values was created for lung volumes.
Reference values are indispensable when interpreting lung volumes in clinical practice, using the LLN for TLC and the ULN for RV to define restrictive impairment and hyperinflation, respectively [13]. Until recently, assessments in Austria and Europe relied mostly on the ECSC reference equations for adults, despite several studies having demonstrated inconsistencies between these reference equations, so the update by GLI was highly anticipated [14–16].
When using the GLI spirometry equations in our population, a good fit was observed. We therefore considered our cohort comparable to the Caucasian cohorts used by GLI to create equations for spirometry and lung volumes. While small differences exist, especially for females, we consider the equations sufficient for the detection of obstructive anomalies in our cohort [17]. This is consistent with previous analyses reporting a good fit with the GLI spirometry equations for other European cohorts [7,18]. While some authors still report significant differences [19], the GLI equations, at least for Caucasian populations, offer consistent cut-offs and improved comparability between cohorts. The large amount of collated data, which smooths out small differences between populations, seems to be one of the main advantages. Additionally, ethnic-specific equations created by GLI are available for spirometry, although their accuracy compared with globally merged equations has recently been questioned [20].
However, the GLI lung volume reference values did not fit well within our cohort. Large differences were observed, with mean Z-scores > 0.5 for TLC, RV and RV/TLC. In addition, the percentage of values below the LLN was lower, and the percentage above the ULN higher, than expected. The difference was even more pronounced in females, including significant differences for IC and ERV. These deviations could lead to under-detection of restrictive disorders and overdiagnosis of hyperinflation in the Austrian population.
So far, there are few data on the performance of the new GLI equations in European cohorts. The number of observations for lung volumes was much lower than for spirometry, and no equations are available for ethnic backgrounds other than Caucasian. A recent study from Belgium found similar results, with the GLI equations underestimating the values for RV in particular [21]. Furthermore, the percentage under the LLN was lower than the expected 5% for TLC. A study in Algerian adults also reported similar results for RV, RV/TLC and TLC, despite a good fit of the GLI spirometry values [22]. One potential explanation for the poor fit of the GLI lung volume equations is that our data were collected recently (starting in 2011). Longitudinal studies have shown that populations are getting taller and healthier [23], with average population lung function increasing [24–27], potentially influenced by socioeconomic factors, or reduced occupational or environmental exposure [25,28]. While the impact of these developments on lung function is still debated in the literature, the large size of our cohort may have made such differences more visible [29].
There were fewer obese and overweight individuals in our cohort than in the GLI cohort. As the significance of weight as a predictor of static lung volumes is not yet conclusively understood [6], we used weight as a predictive variable in an early version of the equations. This only minimally altered the coefficients, and so weight was not used further (data not shown). While weight seems to have only a small impact on overall lung volume reference equations, the effect of body composition could be more important and may explain some of the differences between cohorts.
Future analyses could investigate and include the effect of body compartments on lung volumes.
Other factors contributing to the need to revisit equations could be changes in methods and equipment. Various studies in patients with obstructive lung diseases have demonstrated significant differences between lung volumes measured by gas dilution methods versus plethysmography, although the situation in healthy individuals is less clear [30]. Indeed, GLI found statistically significant differences between these two methods in their cohort, but regarded the differences as not clinically relevant, although the majority of their data were derived from plethysmography [6]. In addition, use of different body plethysmography devices and software could potentially impact the results. For example, in the GLI dataset, devices manufactured by Jaeger (which we used in our study) measured somewhat higher values than those from other manufacturers, especially for RV [6]. Recently, authors from COSYCONET demonstrated differences in FRC of up to 0.67 L between two manufacturers [31]. So, while the simplicity of one equation spanning different techniques, equipment, and populations is one argument for the use of the GLI equations, this might not appropriately represent all different populations and methods. It is to be expected that reference values derived directly from the specific examined population would fit that population better than standardised equations. However, for such population-based equations to be useful, the examined population has to be representative of the general population, as has been shown to be the case for the LEAD cohort [8]. Still, adding more data to the GLI equations may in future improve their generalisability and render population-based equations obsolete.
In this study, the population-derived reference equations from LEAD demonstrated a superior fit for all lung volume indices compared to the GLI equations. Lung volumes in our cohort were influenced by sex, age and height. Some studies have included weight as a predictive variable for lung function [15,16,32], but as with GLI we found only a small influence of weight [6], and our equations therefore do not need to include this parameter. Importantly, we included obese individuals in our analyses, since reference values should be generalisable to the intended population [17]. Our newly derived equations might be usable in other European countries with similar population characteristics and equipment. This will have to be analysed in future studies.

Strengths and limitations
Our analyses were conducted according to the ERS/ATS workshop report requirements [2]. Although these were published over 20 years ago, they remain the most recent criteria available. We used strict selection criteria for our healthy cohort, only including never smokers, and excluding those reporting any respiratory symptoms. In addition, the population was distributed over all age groups, although with an overrepresentation of children, adolescents and females, potentially due to the exclusion of those with a smoking history. We used standardised methods for the measurement of lung volumes, with strict quality control [8], and to create the reference equations we used the same statistical models as GLI. In particular, the LMS model allows the equations to cover the entire age range, avoiding discrepancies at the transition to adulthood [5].
The main limitation of our analyses is the single centre aspect of our lung function testing. The comparison with measurements performed at another site showed only very small differences that were not clinically relevant. Still, a systematic bias cannot be ruled out, as only the device and software of one manufacturer were used. This also limits generalisability to other equipment and software. Furthermore, our cohort included no individuals aged < 6 years or > 80 years, so we recommend the use of our equations only between the ages of 6 and 80 years. Ethnicity was not documented, as participants of the LEAD study, corresponding to the Austrian population, were predominantly of European ancestry. The Austrian population is known to include only a very small proportion of people of non-Caucasian ancestry, so ethnicity was not considered in the initial study design. Therefore, the reference values are only applicable to similar Caucasian populations. We used strict exclusion criteria, but subjects with physiologically abnormal lung function measurements or undiagnosed respiratory disease could still have been present in the analysed cohort.

Conclusion
In our cohort, the GLI lung volume reference equations demonstrated a poor fit for RV, RV/TLC and TLC, especially in females. We therefore developed a new set of Austrian reference equations for static lung volumes that, unlike most reference values, can be applied to both children and adults, from the ages of 6 to 80 years.

Fig. 1
Flow chart for selection of a healthy, asymptomatic cohort

Table 1
Baseline characteristics of the healthy cohort: males. Abbreviations: ERV expiratory reserve volume, FEV1 forced expiratory volume in 1 s, FVC forced vital capacity, FRCpleth functional residual capacity measured by plethysmography, IC inspiratory capacity, RV residual volume, TLC total lung capacity

Table 2
Baseline characteristics of the healthy cohort: females. Abbreviations: ERV expiratory reserve volume, FEV1 forced expiratory volume in 1 s, FVC forced vital capacity, FRCpleth functional residual capacity measured by plethysmography, IC inspiratory capacity, RV residual volume, TLC total lung capacity

Table 3
GLI reference equations for spirometry in males and females of the LEAD cohort. Abbreviations: CI confidence interval of mean Z-scores, FEV1 forced expiratory volume in 1 s, FVC forced vital capacity, GLI Global Lung Function Initiative, KS Kolmogorov-Smirnov test for distribution of mean Z-scores, LLN lower limit of normal, ULN upper limit of normal

Table 4
GLI reference equations for lung volumes in males and females. Abbreviations: CI confidence interval of mean Z-scores, ERV expiratory reserve volume, FRC functional residual capacity, GLI Global Lung Function Initiative, IC inspiratory capacity, KS Kolmogorov-Smirnov test for distribution of mean Z-scores, LLN lower limit of normal, RV residual volume, TLC total lung capacity, ULN upper limit of normal

Table 5
Static lung volume equations for males and females (LEAD 2021). Abbreviations: ERV expiratory reserve volume, FEV1 forced expiratory volume in 1 s, FVC forced vital capacity, FRC functional residual capacity, IC inspiratory capacity, RV residual volume, TLC total lung capacity. Mu-Spline and sigma-Spline correspond to the age-varying coefficients, available as a look-up table in the online supplement

Table 7
Mean measured values and intrasubject variations between two visits. Data are mean (± SD). Abbreviations: ERV expiratory reserve volume, FEV1 forced expiratory volume in 1 s, FVC forced vital capacity, RV residual volume, TLC total lung capacity