Monitoring of noninvasive ventilation: comparative analysis of different strategies

Background: Noninvasive ventilation (NIV) represents an effective treatment for chronic respiratory failure. However, empirically determined NIV settings may not achieve optimal ventilatory support. Therefore, the efficacy of NIV should be systematically monitored. The minimal recommended monitoring strategy includes clinical assessment, arterial blood gases (ABG) and nocturnal transcutaneous pulsed oxygen saturation (SpO2). Polysomnography is a theoretical gold standard but is not routinely available in many centers. Simple tools such as transcutaneous capnography (TcPCO2) or ventilator built-in software provide reliable information, but their role in NIV monitoring has yet to be defined. The aim of our work was to compare the accuracy of different combinations of tests to assess NIV efficacy.
Methods: This retrospective comparative study evaluated the efficacy of NIV in consecutive patients through four strategies (A, B, C and D) using four different tools in various combinations. These tools included morning ABG, nocturnal SpO2, TcPCO2 and data provided by built-in software via a dedicated module. Strategy A (ABG + nocturnal SpO2), B (nocturnal SpO2 + TcPCO2) and C (TcPCO2 + built-in software) were compared to strategy D, which combined all four tools (NIV was appropriate if all four tools were normal).
Results: NIV was appropriate in only 29 of the 100 included patients. Strategy A considered 53 patients as appropriately ventilated. Strategy B considered 48 patients as appropriately ventilated. Strategy C misclassified only 6 patients with daytime hypercapnia.
Conclusion: Monitoring ABG and nocturnal SpO2 is not enough to assess NIV efficacy. Combining data from ventilator built-in software and TcPCO2 seems to represent the best strategy to detect poor NIV efficacy.
Trial registration: Institutional Review Board of the Société de Pneumologie de Langue Française (CEPRO 2016 Georges)


Background
Non-invasive ventilation (NIV) is recognized as an effective treatment of chronic hypercapnic respiratory failure (CHRF) [1]. Due to growing evidence of NIV efficacy in a broad range of indications as well as increasing availability of high-performance and user-friendly home ventilators, the number of patients receiving NIV at home has been regularly increasing over the past 30 years [2][3][4]. When NIV is initiated to treat CHRF, ventilator settings are empirically determined based on the underlying disease, patient tolerance and diurnal changes in arterial blood gases (ABG) [5]. However, NIV is usually applied during the night. As a result, daytime adjustment of ventilator settings may not achieve optimal nocturnal ventilatory support. This can be explained by sleep-related changes in breathing. Sleep induces modifications in ventilatory control, respiratory muscle recruitment and upper airway patency, which may all affect ventilatory function, especially in patients with CHRF [6]. Moreover, applying intermittent positive pressure may by itself trigger abnormal respiratory events [7]. For instance, reduction of ventilatory drive with or without glottic closure, residual upper airway obstruction and patient-ventilator asynchrony can all compromise the efficacy of NIV [7]. Furthermore, as NIV uses a non-airtight system, unintentional leaks are frequent [8]. Leaks during NIV can interfere with patient-ventilator interaction [9]. These respiratory events are frequent under NIV [8][10][11][12][13] and may have an impact on prognosis [14][15][16].
Therefore, NIV should be systematically monitored. However, optimal modalities for monitoring of long-term ventilated patients remain a matter of debate. Hence, physicians may adopt different approaches to assess NIV performance. Some authors suggest that complete polysomnography (PSG) should be performed for each patient under NIV to verify its efficacy [7,17]. This technique is not feasible in many centres on a routine basis. In contrast, the 2010 American Academy of Sleep Medicine (AASM) recommendations for best clinical practices state that patients on long-term NIV should be assessed regularly with measures of oxygenation and ventilation (i.e. ABG, nocturnal pulse oximetry, end-tidal CO2 or transcutaneous capnography) [18,19]. Over the past years, the use of TcPCO2 has been simplified. Home ventilators have built-in software that provides detailed information on relevant ventilator parameters to assess the efficacy of NIV. A step-by-step strategy starting with ABG and nocturnal SpO2 has been proposed by the SomnoNIV group [19]. However, few studies have evaluated these proposed monitoring strategies in clinical practice [20].
This study aimed to compare the accuracy of four different strategies using four easily available assessment tools in different combinations to determine NIV efficacy during elective evaluations of patients on long-term NIV.

Methods
All patients under long-term home NIV followed by the Pulmonary Department of Dijon University Hospital are hospitalized electively for one night on a regular basis to assess efficacy of their NIV. These admissions are scheduled by the attending specialist every 3 to 12 months: intervals depend on the underlying respiratory disease and its progression rate, prior assessment of NIV efficacy or tolerance and intercurrent medical events.
In this retrospective comparative study, we included consecutive patients treated with long-term NIV and hospitalized in our unit for an elective follow-up visit over a 1-year period. Inclusion criteria were: use of a home bi-level pressure support ventilator (VPAP™, ResMed, North Ryde, Australia) and being in a stable clinical condition for at least 3 months prior to inclusion.
Exclusion criteria included: age below 18 years, oxygen supplementation, use of a ventilator from other manufacturers, mean daily NIV use of less than 4 h per night, inability to cooperate and change in NIV treatment in the preceding 3 months.
NIV was evaluated with usual ventilator settings and interface. For each patient, we recorded four monitoring tools during the same overnight assessment: (1) morning ABG measured during spontaneous breathing by puncture of the radial artery during the first hour after disconnection from the ventilator, (2) nocturnal pulsed oxygen saturation (SpO2; Nonin model 8500 oximeter, Nonin Medical, Plymouth, MN, USA), (3) transcutaneous capnography (TcPCO2; Tosca®, Radiometer, Copenhagen, Denmark) and (4) data from a simplified monitoring module coupled to their portable ventilator (ResLink™, ResMed). Data from the ventilator software were collected on a Smart Media card (SanDisk, Milpitas, CA, USA) then downloaded with ResScan™ software (ResMed, North Ryde, Australia). The software provided an accurate estimation of non-intentional air leaks (i.e. leaks exceeding what was expected from the exhalation valve of the interface used) [8]. The additional connection of a pulse oximeter allowed simultaneous recording of nocturnal SpO2.
We evaluated the efficacy of NIV through four strategies (A, B, C and D) using the results of four different tools in different combinations. Strategy A combined ABG and nocturnal SpO2, the minimal recommended monitoring combination [19]. Strategy B combined nocturnal SpO2 and TcPCO2 (since transcutaneous capnography provides SpO2 and TcPCO2 simultaneously, both parameters could be analyzed concurrently). Strategy C combined TcPCO2 and data from built-in ventilator software. Strategy D combined all the available tools (i.e. ABG, nocturnal SpO2, TcPCO2 and data from ventilator software). Strategy D was used as the reference to classify patients as appropriately ventilated or not: NIV was considered effective if none of the abnormality criteria was fulfilled for any of the four tools, i.e. if all four tools were normal.
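The way the four strategies combine the tools can be sketched as follows. This is a minimal illustration only: the names `ToolResults`, `STRATEGIES` and `classify` are hypothetical, and each tool is reduced to a single normal/abnormal flag, with the abnormality criteria themselves left outside the sketch.

```python
from dataclasses import dataclass


@dataclass
class ToolResults:
    """One patient's results, reduced to a normal/abnormal flag per tool (assumption)."""
    abg_normal: bool
    spo2_normal: bool
    tcpco2_normal: bool
    software_normal: bool


# Tools combined by each strategy, as described in the Methods.
STRATEGIES = {
    "A": ("abg_normal", "spo2_normal"),
    "B": ("spo2_normal", "tcpco2_normal"),
    "C": ("tcpco2_normal", "software_normal"),
    "D": ("abg_normal", "spo2_normal", "tcpco2_normal", "software_normal"),
}


def classify(results: ToolResults) -> dict:
    """Return, for each strategy, whether NIV is judged appropriate.

    A strategy judges NIV appropriate only if every tool it uses is normal.
    """
    return {
        name: all(getattr(results, tool) for tool in tools)
        for name, tools in STRATEGIES.items()
    }


# Hypothetical patient: normal ABG, SpO2 and TcPCO2, but abnormal software data (e.g. leaks).
example = classify(ToolResults(True, True, True, False))
```

In this hypothetical case, strategies A and B would label the patient as appropriately ventilated, whereas strategies C and D would not.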
The St. Mary's Hospital questionnaire was completed in the morning after the overnight assessment to evaluate subjective sleep quality on a 12-point scale [24]. Another questionnaire assessed the self-perceived quality of ventilation using an eight-item visual analogue scale (10 points per item) covering three domains: patient-ventilator synchronisation, efficacy and leaks [25]. Higher values indicated better treatment comfort, with a maximum score of 80.
The study was approved by the Institutional Review Board of the Société de Pneumologie de Langue Française.

Statistical analysis
Statistical analyses were performed using SigmaPlot 13 software (Systat Software, San Jose, CA, USA). The normality of the distribution of the variables analysed was assessed using the Kolmogorov-Smirnov test. As most data were not normally distributed, we reported results as median and quartiles and used non-parametric tests. We used the Mann-Whitney U test to compare "appropriately" and "inappropriately" ventilated patients for continuous variables. Categorical variables (gender, interfaces) were compared using a χ² test. For comparisons between three or more groups (classification of patients according to the aetiology of chronic respiratory failure), we used the Kruskal-Wallis test; subsequent paired comparisons were made using a post-hoc Dunn's analysis. Statistical significance was set at p < 0.05, or at $p < 1 - (1 - \alpha)^{1/k}$ for multiple comparisons, where α = 0.05 and k denotes the number of comparisons.
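The multiple-comparison threshold above has the form of the Šidák correction. As a minimal sketch (the function name `sidak_threshold` is hypothetical, and the value of k used in the study is not stated in this section), the per-comparison significance level can be computed as:

```python
def sidak_threshold(alpha: float, k: int) -> float:
    """Per-comparison significance level 1 - (1 - alpha)^(1/k) for k comparisons."""
    if k < 1:
        raise ValueError("k must be a positive number of comparisons")
    return 1.0 - (1.0 - alpha) ** (1.0 / k)


# Assumption for illustration: three aetiology groups give three pairwise comparisons.
threshold_three_groups = sidak_threshold(0.05, 3)
```

With α = 0.05 and k = 3, the threshold is approximately 0.017; with k = 1 it reduces to α itself.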
The agreement between the different methods of NIV monitoring and strategy D was evaluated with Cohen's kappa coefficient [26].
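For readers less familiar with the kappa statistic, the sketch below computes Cohen's kappa from a 2 × 2 agreement table. The function name and argument names are hypothetical. The example counts are illustrative: they follow from the proportions reported in the abstract (53 patients appropriate under strategy A, 29 under strategy D, 100 patients in total) under the assumption that every patient classified as appropriate by strategy D is also classified as appropriate by strategy A, which the definition of strategy D implies. The resulting value is not taken from the study's tables.

```python
def cohens_kappa(both_yes: int, only_first: int, only_second: int, both_no: int) -> float:
    """Cohen's kappa for two binary raters, from the four cells of a 2 x 2 table."""
    n = both_yes + only_first + only_second + both_no
    if n == 0:
        raise ValueError("empty table")
    observed = (both_yes + both_no) / n
    first_yes = both_yes + only_first
    second_yes = both_yes + only_second
    # Agreement expected by chance from the marginal totals.
    expected = (first_yes * second_yes + (n - first_yes) * (n - second_yes)) / n**2
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Illustrative counts: strategy A (first rater) versus strategy D (second rater).
kappa_a_vs_d = cohens_kappa(both_yes=29, only_first=24, only_second=0, both_no=47)
```

Under these assumptions kappa is about 0.53, showing how a strategy can agree with the reference in 76% of patients and still have only moderate chance-corrected agreement.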
We used receiver operating characteristic (ROC) curves to evaluate the performance of nocturnal SpO2 and ABG to identify patients classified as adequately ventilated according to strategy D. We considered agreement to be sufficient if the lower bound of the 95% confidence interval for the area under the ROC curve was > 0.7. ROC curve analyses were also used to determine the most suitable threshold values of mean nocturnal SpO2 and morning PaCO2 for assessing NIV efficacy.
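The text does not state which criterion defined the "most suitable" threshold on the ROC curve. One common choice, shown here purely as an assumption, is to maximise Youden's J (sensitivity + specificity − 1). The function name and the PaCO2 values below are hypothetical and are not study data.

```python
def best_threshold(values, appropriate, lower_is_better=True):
    """Return (threshold, youden_j, sensitivity, specificity) maximising Youden's J.

    values: one measurement per patient (e.g. morning PaCO2 in mmHg).
    appropriate: reference classification per patient (True = appropriately ventilated).
    A patient is predicted 'appropriate' if the value is <= threshold
    (or >= threshold when lower_is_better is False).
    """
    n_pos = sum(1 for a in appropriate if a)
    n_neg = len(appropriate) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best = None
    for t in sorted(set(values)):
        predicted = [(v <= t) if lower_is_better else (v >= t) for v in values]
        tp = sum(1 for p, a in zip(predicted, appropriate) if p and a)
        tn = sum(1 for p, a in zip(predicted, appropriate) if not p and not a)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        j = sensitivity + specificity - 1.0
        if best is None or j > best[1]:
            best = (t, j, sensitivity, specificity)
    return best


# Hypothetical, perfectly separable data for illustration only.
paco2 = [38, 40, 41, 42, 44, 46, 48, 50]
reference = [True, True, True, True, False, False, False, False]
result = best_threshold(paco2, reference)
```

On real data the classes overlap, so the selected threshold trades sensitivity against specificity rather than separating the groups perfectly.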

Results
One hundred and thirty-four patients were screened. Two subjects were excluded due to corruption of raw data from the ventilator software. Thirty-two patients under oxygen therapy were also excluded from further analyses. These subjects suffered more often from obstructive lung diseases (OLD) and presented more severe diurnal and nocturnal hypercapnia (p < 0.001).

Study population
The remaining 100 patients were treated with NIV for OLD (n = 25), chest wall diseases (CWD, n = 29) and neuromuscular diseases (NMD, n = 46) according to the Eurovent diagnostic groups [2] (Table 1). Demographic characteristics, ABG, TcPCO2 and ventilator settings are summarized in Table 2. As expected, NMD patients were younger, had a lower BMI and required lower levels of pressure support to reach more effective control of diurnal and nocturnal hypercapnia. Nasal masks were used more frequently in this group than in OLD or CWD subjects (p < 0.05).

Assessment of NIV efficacy
TcPCO2 revealed significant nocturnal hypoventilation in 27% of the patients. Among them, 6% had normal ABG and 12% had normal nocturnal SpO2. Data from built-in ventilator software were abnormal in 57% of the patients. Leaks represented the most common abnormality (28%). Table 3 compares the performances of different strategies. NIV was appropriate in only 29% of patients. No significant differences were found regarding ventilator settings or interfaces between appropriately and inappropriately ventilated patients. NIV compliance did not differ significantly between appropriately and inappropriately ventilated patients (8.5 [6.9-10] vs. 7.5 [6.1-9.9] hours per night, respectively).
With strategy A, 53% of patients were considered appropriately ventilated. Among the 48% of patients with normal results using strategy B, data from built-in ventilator software identified major leaks in 18% and significant drops in SpO2 associated with decreases in flow despite effective ventilator pressure in 10% of patients. Table 4 presents the ROC curve analysis of optimal threshold values of ABG and nocturnal SpO2 for identifying appropriately ventilated patients (defined by strategy D).

Optimal threshold values for PaCO2 and SpO2 for identifying suboptimal NIV according to strategy D
A morning PaCO2 value of 42 mmHg was the best threshold for identifying appropriate NIV (Fig. 1a): 69% of the patients were correctly classified using this value.
The best threshold for time spent with SpO2 below 90% was 5% (Fig. 1b): 63% of the patients were correctly classified using this value. Higher values for time spent with SpO2 below 90% had a lower sensitivity with a similar specificity.

Subjective assessment of quality of sleep and comfort of ventilation
Perceived quality of sleep (Fig. 2a) and comfort of ventilation (Fig. 2b) did not differ significantly between appropriately and inappropriately ventilated patients.

Discussion
In this real-life study, we compared different strategies to assess the efficacy of NIV. Our results suggest that using a combination of daytime ABG and nocturnal SpO2 (referred to as strategy A, proposed by a group of experts [19]) was not sensitive enough to assess NIV efficacy. A significant proportion of the patients considered appropriately ventilated by strategy A had residual nocturnal abnormalities under NIV (hypoventilation, unintentional leaks or abnormal events). In these patients, forgoing further NIV testing could be deleterious. A combination of TcPCO2 and data from ventilator software, referred to as strategy C, was the most accurate non-invasive strategy for assessing NIV efficacy. Improving NIV efficacy is an important issue in patients with long-term NIV: residual respiratory events under NIV may have a negative impact on patient-related outcomes such as symptoms, health-related quality of life and survival. Nocturnal hypoventilation is associated with a decreased survival rate, especially in neuromuscular diseases [14,16], as well as adverse neuro-cognitive and cardiovascular consequences in chronic respiratory failure [27]. Leaks above 0.4 l/s [28] may induce patient-ventilator asynchrony [12,29], alter quality of sleep [30][31][32][33] and potentially decrease health-related quality of life. Abnormal respiratory events under NIV (upper airway obstructive events with or without nocturnal desaturations or residual hypoventilation or symptoms) are associated with a decreased survival rate in patients suffering from amyotrophic lateral sclerosis (ALS) [15].
To detect residual nocturnal hypoventilation, we suggest using TcPCO2 instead of morning ABG. In ventilated patients, PaCO2 measured by arterial puncture may not provide an accurate picture of the overnight time course of PaCO2 [19,22]. Several studies have shown that continuous TcPCO2 recording is well correlated with arterial measurements in chronic respiratory failure under NIV [10,34,35].
Experts propose different thresholds to assess the efficacy of NIV but little evidence substantiates the relevance of these values. Regarding TcPCO2, several thresholds have been suggested to define significant nocturnal hypercapnia: maximal TcPCO2 > 49 mmHg [36,37]; TcPCO2 > 49 mmHg for > 10% of recording time [22]; TcPCO2 > 55 mmHg for ≥ 10 min or an increase in TcPCO2 ≥ 10 mmHg above awake supine value to a value exceeding 50 mmHg for ≥ 10 min [18]. Clinically relevant threshold values may differ according to (1) the method and device used, (2) the etiology of chronic respiratory failure, (3) the goal of TcPCO2 recording (i.e. to decide when NIV should be initiated or to monitor NIV efficacy) and (4) PCO2 levels when NIV is started. For example, prognosis is improved in COPD if NIV effectively reduces PaCO2 by more than 20% [38]. The thresholds used may also depend on the type of capnograph as bias between arterial and transcutaneous values changes according to the device used [39]. The device used in our study slightly overestimated PaCO2. The maximal bias published with this device was 5.6 ± 3 mmHg [40]. We therefore considered residual nocturnal hypoventilation as significant when mean TcPCO2 was ≥ 50 mmHg [41].
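To make the difference between some of these definitions concrete, the following sketch applies three of the thresholds quoted above to an overnight TcPCO2 trace. It is an illustration under stated assumptions: the function name and the sample values are hypothetical, samples are assumed to be equally spaced so that the fraction of samples approximates the fraction of recording time, and the duration-based criteria of [18] are not implemented.

```python
from statistics import mean


def tcpco2_flags(samples_mmhg):
    """Evaluate three published definitions of nocturnal hypercapnia on one recording.

    samples_mmhg: TcPCO2 values in mmHg, assumed to be equally spaced in time.
    """
    if not samples_mmhg:
        raise ValueError("empty recording")
    fraction_above_49 = sum(1 for s in samples_mmhg if s > 49) / len(samples_mmhg)
    return {
        # Criterion used in the present study: mean TcPCO2 >= 50 mmHg.
        "mean_ge_50": mean(samples_mmhg) >= 50,
        # Maximal TcPCO2 > 49 mmHg.
        "max_gt_49": max(samples_mmhg) > 49,
        # TcPCO2 > 49 mmHg for more than 10% of recording time.
        "above_49_over_10pct_time": fraction_above_49 > 0.10,
    }


# Hypothetical trace: mostly 44 mmHg with a single sample at 52 mmHg.
flags = tcpco2_flags([44] * 9 + [52])
```

In this hypothetical trace only the maximal-value criterion is met, which illustrates how the choice of definition can change the classification of the same recording.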
The clinical contribution of nocturnal transcutaneous capnography can be improved by simultaneously recording SpO2 [19]. Sampling rate and averaging of SpO2 and TcPCO2 recordings are different: SpO2 can detect short desaturations linked to short ventilatory events while TcPCO2 has a longer lag time but is an accurate tool to evaluate overnight trends in ventilation. Hence, both tools are complementary and devices used in clinical practice combine TcPCO2 and SpO2 sensors. However, capnography does not provide information about the underlying pathophysiological mechanisms. Furthermore, in a quarter of patients with normal TcPCO2 and SpO2 (strategy B), we found significant leaks or abnormal residual respiratory events (i.e. flow reduction or patient-ventilator asynchronies). Our study confirms the additional contribution of data from ventilator software for the detection of these events. The accuracy of the ResScan™ system used to assess leaks has been confirmed in a bench model by our group and others [8,42].
Our results suggest that using stricter thresholds for PaCO2 and nocturnal pulse oximetry (NPO) may compensate for their lack of sensitivity. For instance, using a PaCO2 threshold value of 42 mmHg could increase the accuracy of ABG for the detection of nocturnal hypoventilation.
Time spent with an SpO2 below 90% is the most frequently used parameter to interpret nocturnal pulse oximetry, but threshold values vary considerably between authors and aetiologies. In non-ventilated patients suffering from chronic obstructive pulmonary disease (COPD), Levi-Valensi et al. [43] documented a shorter survival in patients spending more than 30% of total sleep time with an SpO2 below 90%. More recently, Gonzalez-Bermejo et al. [14] showed that ALS patients under NIV had a better survival if less than 5% of NPO time was spent with an SpO2 < 90%. In our study, using a threshold of 5% increased the accuracy of NPO in detecting residual nocturnal hypoventilation.
An analysis combining the signals provided by TcPCO2 and data from ventilator software may be an interesting option for monitoring NIV, offering a noninvasive global estimation of NIV efficacy without requiring ABG. Moreover, this approach enables unattended assessment both at the hospital and at home without complex logistics. Failure to retrieve data is rare [44] and instrumental drift of TcPCO2 is a minor problem when used by an experienced team [20,39,45,46]. Interpretation of the results is simple and further analysis of detailed raw data provided by ventilator software can help clarify the underlying mechanism implicated in NIV inefficacy. This may allow optimization of ventilator settings, limiting PSG to more complex cases. Unfortunately, use of TcPCO2 is at present still limited by the cost of the devices.
We acknowledge a few limitations to our study. Firstly, we did not perform full PSG under NIV. Even if PSG allows the evaluation of patient-ventilator interactions and characterization of abnormal respiratory events occurring under NIV [7], the impact of these events on morbidity and related therapeutic end points remains speculative [47]. Furthermore, PSG does not provide an accurate estimation of alveolar ventilation per se, which is the main goal of ventilator assistance. It is also probable that leaks would be underestimated by PSG.
Secondly, we excluded 32 patients with nocturnal NIV and oxygen therapy. Supplemental oxygen affects SpO2 values and reduces the amplitude of desaturations, decreasing the reliability of NPO to assess NIV efficacy. It must be noted that the majority of excluded patients suffered from chronic obstructive pulmonary disease.
Thirdly, because NIV is considered beneficial if used for more than 4 h per night (for ALS [48]; for COPD [49]; for obesity-hypoventilation syndrome [50]), we excluded patients using NIV for less than 4 h per night. Poor compliance with NIV may result from discomfort related to leaks or a low perceived benefit of treatment. This exclusion may therefore have led us to underestimate the proportion of inadequately ventilated patients, even though leaks were the most frequent abnormality in our study.
Fourthly, we failed to show an impact of NIV efficacy on sleep quality or patient symptoms. Both scores employed for assessing comfort and quality of sleep have been previously used to assess the subjective impact of changes in ventilator modes (volume-targeted versus conventional bi-level pressure support) [25]. Our results suggest that subjective assessment does not suffice for the detection of inappropriate ventilation. The poor correlation between residual respiratory events and patients' perception has been previously reported [9,10]. Finally, the impact of NIV efficacy on survival could not be assessed due to the heterogeneity of our population consisting of subgroups (OLD, CWD, NMD) with different prognoses. Further investigations are needed to identify which of the selected tools significantly impact patient-related outcomes such as symptoms, health-related quality of life or survival.
In summary, this study shows that combining morning ABG and nocturnal SpO2 is not sufficient to accurately assess NIV efficacy. An alternative strategy combining data from ventilator software and TcPCO2 performed better for detecting inappropriate NIV without requiring ABG. Models of care for chronically ill patients living at home are evolving with telemonitoring. TcPCO2 and ventilator software data are increasingly available at home. Moreover, their easy interpretation makes this combination feasible in real life and in a variety of clinical settings. This combination may be very useful in future strategies for long-term NIV monitoring.