Our analyses show that improvement in FEV1 is significantly related to changes in the patient-reported outcomes TDI, SGRQ, exacerbation rate and rescue medication use over 12-52 weeks of treatment. These relationships were significant at both an individual and population level, although correlations were much stronger in the population-based analyses.
Few studies have examined the relationship between change in FEV1 and change in outcomes. However, our results are consistent with analyses of patients from the 3-year EUROSCOP (The European Respiratory Society Study on Chronic Obstructive Pulmonary disease) study, in which an improvement of 100 ml in FEV1 was associated with a 4% reduction in dyspnoea in males, and with a 16-week clinical study, in which a significant but weak correlation between change in FEV1 and change in SGRQ score was demonstrated (r = 0.33, P = 0.001). Further, a recent systematic review of 22 studies found that a 100 ml increase in FEV1 was associated with a statistically significant reduction in SGRQ of 2.5 units. However, to our knowledge, the current analysis is the largest and most comprehensive to investigate the correlation between change in FEV1 and change in outcomes using individual patient data from studies of similar design. This provides a relatively homogeneous population for analysis, compared with study-level meta-analyses.
We demonstrated that a 100 ml increase in trough FEV1 (a magnitude of change with perceptible effects) was associated with a 0.46-unit increase in TDI and a 1.3- to 1.9-unit improvement in SGRQ after a 24/26-week treatment period and, over 12-52 weeks of treatment, with a 10% decrease in daily rescue medication use and a 12% decrease in the annual exacerbation rate. In general, we found that treatment, baseline severity, concomitant ICS use and world region did not affect the slope of the relationship between outcome and change in FEV1. The exception was ΔSGRQ, for which more severe COPD, as characterised by a lower FEV1 and a higher SGRQ at baseline, was associated with a steeper slope than less severe COPD. This is consistent with results from the 3-year TORCH (TOwards a Revolution in COPD Health) study, in which trends to greater improvement in SGRQ with worsening GOLD severity were noted with active treatments.
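The per-100 ml figures above can be turned into a small worked calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the TDI association scales linearly with ΔFEV1 and that the percentage decreases in rescue medication use and exacerbation rate act multiplicatively (as they would under a log-link count model, which is an assumption here rather than a description of the fitted models). Values for changes other than 100 ml are extrapolations and not results reported in this analysis.

```python
def linear_change(units_per_100ml, delta_fev1_ml):
    """Change in a continuous outcome, assuming a constant slope per 100 ml."""
    return units_per_100ml * delta_fev1_ml / 100.0


def rate_ratio(percent_decrease_per_100ml, delta_fev1_ml):
    """Rate ratio for a count outcome, assuming the percentage decrease
    per 100 ml compounds multiplicatively (assumed log-linear relationship)."""
    per_100ml_ratio = 1.0 - percent_decrease_per_100ml / 100.0
    return per_100ml_ratio ** (delta_fev1_ml / 100.0)


# Figures reported in the text for a 100 ml increase in trough FEV1.
tdi_change_100 = linear_change(0.46, 100)   # TDI units
exac_ratio_100 = rate_ratio(12, 100)        # annual exacerbation rate ratio
rescue_ratio_100 = rate_ratio(10, 100)      # daily rescue medication use ratio

# Hypothetical extrapolation to a 150 ml increase (not a reported result).
exac_ratio_150 = rate_ratio(12, 150)

print(f"TDI change per 100 ml: {tdi_change_100:.2f}")
print(f"Exacerbation rate ratio per 100 ml: {exac_ratio_100:.2f}")
print(f"Rescue medication ratio per 100 ml: {rescue_ratio_100:.2f}")
print(f"Exacerbation rate ratio at 150 ml (extrapolated): {exac_ratio_150:.3f}")
```

A rate ratio below 1 corresponds to a reduction. Because the slope may differ for negative ΔFEV1, this sketch should only be read for positive changes.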
Although severe exacerbations showed a trend toward greater reductions with increasing ΔFEV1, the relationship was not statistically significant. While the observed 12% reduction in overall exacerbation rate for an improvement of 100 ml in FEV1 was comparable with previously published data, the studies included in our analysis were not powered to show an effect on exacerbations, and did not specifically recruit patients at risk of exacerbations.
We found inconsistent effects of different treatments across individual outcomes, perhaps because patient numbers in sub-categories were too low to demonstrate consistent differences for individual treatments across all outcomes. However, our analysis did demonstrate that the relationship between ΔFEV1 and outcome appeared to be the same, regardless of treatment arm. Similarly, baseline severity, ICS use and world region were assessed as main effects, as well as for their potential influence on the effect of ΔFEV1. Although numbers of patients in GOLD 4 (as well as GOLD 1) were too small to make any inferences, patients predominantly in GOLD 3 at baseline, and those using ICS, consistently exhibited significantly worse outcomes. Indeed, the variability in baseline severity and ICS use is likely to have been a major contributor to the large variability in observed outcomes.
The relationships between outcomes and ΔFEV1 may differ between negative and positive ΔFEV1, and for this reason, the models included a possible breakpoint at zero in the relationship slope. The inclusion of this breakpoint was found to be significant for TDI and ΔSGRQ, suggesting that baseline severity and other included covariates could not explain the observed behaviour fully. These results may have been influenced by differences in withdrawal rates between categories, since the highest withdrawal rate was in those with a negative change in FEV1, although differences between groups were minimal. The inclusion of a breakpoint was not significant for rescue medication and exacerbations, even though Figure 1 suggested that it might be important, especially for exacerbations. The large variability and count nature of the data for rescue medication and exacerbations may have caused type II statistical errors, i.e., failure to find true breakpoints to be significant.
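The idea of a breakpoint at zero can be illustrated with a minimal sketch in which the slope is allowed to differ for negative and positive ΔFEV1. This is not the model fitted in our analysis: it uses ordinary least squares with no covariates, the function names are hypothetical, and the data are noise-free illustrative values rather than trial data.

```python
import numpy as np


def breakpoint_design(delta_fev1):
    """Design matrix with an intercept and separate slope terms for the
    negative and positive parts of delta_fev1 (breakpoint fixed at zero)."""
    x = np.asarray(delta_fev1, dtype=float)
    negative_part = np.minimum(x, 0.0)  # equals x where x < 0, else 0
    positive_part = np.maximum(x, 0.0)  # equals x where x > 0, else 0
    return np.column_stack([np.ones_like(x), negative_part, positive_part])


def fit_breakpoint(delta_fev1, outcome):
    """Least-squares fit of the piecewise-linear model (assumed estimator)."""
    design = breakpoint_design(delta_fev1)
    coef, *_ = np.linalg.lstsq(design, np.asarray(outcome, dtype=float), rcond=None)
    return {
        "intercept": coef[0],
        "slope_negative": coef[1],
        "slope_positive": coef[2],
    }


# Illustrative, noise-free values (ml); not data from the studies.
x = np.array([-200.0, -100.0, 0.0, 100.0, 200.0, 300.0])
y = 1.0 + 0.002 * np.minimum(x, 0.0) + 0.005 * np.maximum(x, 0.0)

fit = fit_breakpoint(x, y)
print(fit)
```

In this parameterisation the intercept is the expected outcome at zero change in FEV1, and the breakpoint matters only if the two slopes differ. A formal test of the breakpoint would compare this model with a single-slope model.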
We found that zero change in FEV1 was associated with significant improvements in TDI and SGRQ. Additionally, while a greater proportion of patients achieved the MCID for TDI and SGRQ as ΔFEV1 increased, our results indicated that as many as 50% of patients responded irrespective of ΔFEV1, possibly an effect of clinical trial participation, which is seen consistently in the placebo limb of clinical trials [28–30].
We constructed the models in our analysis using ΔFEV1 as a predictor, and the other outcome measures as the response variables, based on the results of a carefully controlled series of clinical trials. However, ΔFEV1 was as much a response as was the outcome, so ΔFEV1 was not an 'independent' variable controlled as part of the experimental design. There may have been further confounders that simultaneously affected how both ΔFEV1 and the outcome responded to treatment. The fitted models therefore describe the observed relationships under the conditions of a clinical trial, but do not provide a definitive answer as to whether there is a causal relationship between ΔFEV1 and the outcomes.
The studies included in our analysis were powered on the spirometric endpoint FEV1, which is required by regulatory agencies for the approval of bronchodilators, and is included in the majority of treatment guidelines. For this reason, we made FEV1 the focus of our analysis. Other physiological parameters, such as inspiratory capacity, may have stronger correlations with dyspnoea. However, data for these parameters were not available from our dataset, and further research is needed to investigate such correlations in large numbers of patients.