Derivation of normative data for the COPD assessment test (CAT)

Background The traditional classification of the severity of COPD, based on spirometry, fails to encompass the heterogeneity of the disease. The COPD assessment test (CAT), a multi-dimensional, patient-completed questionnaire, assesses the overall health status of patients, and is recommended as part of the assessment of individuals with COPD. However, information regarding the range of values for the test in a non-COPD population (normative values) is limited, and consequently, knowledge regarding the optimal cut-off and the minimum clinically important difference (MCID) for the test remains largely empirical.
Methods CanCOLD is a population-based multi-center cohort study conducted across Canada, the methodology of which is based on the international BOLD initiative. The study includes subjects with COPD, at-risk individuals who smoke, and healthy control subjects. CAT questionnaires were administered at baseline to all subjects. Among non-COPD subjects, normative values for the CAT questionnaire and psychometric properties of the test were characterized. Predictors of high CAT scores were identified using multivariable logistic regression.
Results Of the 525 non-COPD subjects enrolled, 500 were included in the analysis. Mean FEV1/FVC ratio among the 500 included subjects was 0.77 (SD 0.49); the mean predicted FEV1 was 99.38% (SD 16.88%). The overall mean CAT score was 6 (SD 5.09); scores were higher among females (6.43, SD 5.59) and subjects over 80 years of age (mean 7.58, SD 6.82). Cronbach alpha for the CAT was 0.79, suggesting a high internal consistency for the test. A score of 16 was the 95th percentile for the population, and 27 subjects (5.4%) were found to have a CAT score ≥16. Current smoking (aOR 3.41, 95% CI 1.05, 11.02), subject-reported physician-diagnosed asthma (aOR 7.59, 95% CI 2.71, 21.25) and musculoskeletal disease (aOR 4.09, 95% CI 1.72, 9.71) were found to be significantly associated with a score ≥16.
Conclusions The characterization of CAT scores in the general population will be useful for norm-based comparisons. Longitudinal follow-up of these subjects will help in the optimization of cut-offs for the test.


Background
Chronic Obstructive Pulmonary Disease (COPD) is the 4th leading cause of mortality worldwide, and causes significant morbidity [1,2]. Though spirometry is required for the diagnosis of COPD, it is not a substitute for measuring the patient's perspective with respect to symptoms, function or overall health condition. More recently, the GOLD (Global Initiative for Chronic Obstructive Lung Disease) strategy document has acknowledged the increasing evidence that FEV1 alone is inadequate for describing COPD status [3]. The new recommended approach represents a move towards individualized treatment for COPD patients, matching therapy more closely to a multidimensional assessment of specific patient attributes such as symptoms, spirometric classification and evaluation of the risk of future adverse events, particularly exacerbations. GOLD recommends the COPD Assessment Test (CAT) or mMRC scores to differentiate between patients experiencing a low or high burden of symptoms.
The CAT questionnaire was developed, and its items selected, on the basis of patients' testimonies about COPD and the effect that it has upon them [4]. The final CAT consists of eight items, each formatted as a six-point differential scale, making the tool easy to administer [5]. The psychometric properties of the questionnaire have been studied and validated in the patient population in the clinical setting, and the test appears to have good construct and discriminant validity [5][6][7][8], and to be responsive to changes in the health status of patients with COPD [9][10][11]. A study from our group, which included both COPD and non-COPD subjects selected from the population, and from one of the sites of the cohort included in the present study, suggested that the overall CAT score had a statistically significant association with a diagnosis of COPD [12].
Although the CAT questionnaire was developed for use among patients with COPD, the interpretation of measures of COPD health status may represent a considerable challenge with respect to the magnitude and gravity of the scale of responses. Norm-based comparisons, derived from the general, non-COPD population, are a strategy that can help resolve these challenges. They can also help in testing hypotheses for clinical trials that are targeted at improving patient-centered outcomes, since knowing the clinical relevance of a score is needed to adequately power such trials. The CAT questionnaire has been tested in the general population in the Middle East, as part of the BREATHE study [8], although the participants were poorly characterized in terms of physiological measurements to confirm the presence or absence of COPD. It has also been tested among healthy industrial workers in Japan [13]. Normative values for the test in a Western setting are not known.
The present study was performed to obtain population norms for the CAT in a well-characterized, population-based sample. The primary objective of the present study was to derive the normative values for the CAT questionnaire for the adult general population (≥40 years of age), and the norms by age and sex. The secondary objectives were: a) to identify subjects who had a CAT score equal to or greater than the 95th centile for this population, and to identify predictors, using an adjusted multivariable analysis, that were different in these subjects when compared to individuals with scores below the 95th centile; b) to test the psychometric properties of the CAT in the general population, including internal consistency and validity.

Study population
The Canadian Cohort Obstructive Lung Disease (CanCOLD) study is a prospective longitudinal cohort study that is designed to track 1500 subjects across Canada (recruited using a random sampling frame from the populations of 9 urban/suburban areas) [14]. The study is based on the international BOLD initiative [15]. The cohort comprises two balanced COPD subsets (GOLD ≥ 2 and GOLD 1) and two age- and sex-matched non-COPD peer groups (ever-smokers at risk and healthy controls, i.e. never smokers with post-bronchodilator FEV1/FVC > 0.70). Assessments are made at baseline, 18 months, 3 years and beyond, and include, for non-COPD subjects, the CAT questionnaire, the SF-36 questionnaire, complete pulmonary function and cardiopulmonary exercise assessments, CT scans of the chest and blood tests. The details of the sampling strategy and assessments have been described in the published protocol [14]. For the present study, we used the CAT completed at baseline by the subjects recruited in the non-COPD subsets (ever-smokers at risk, healthy non-smoker controls) of the CanCOLD study. The CanCOLD study was approved by the Research Ethics Board (REB) at the McGill University Health Centre (MUHC) (Study # 09-025-BMC).

CAT questionnaire
The CAT (Figure 1) has 8 questions (items) that assess cough, production of phlegm, chest tightness, breathlessness, activity limitation, confidence, sleep and energy. The minimum score possible (floor) for each item is zero, and the maximum (ceiling) is five. Thus, the overall score can range from 0 to 40.
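The scoring rule described above can be illustrated with a minimal sketch. The item names and the function `cat_total` are hypothetical labels chosen for this example and are not part of the CAT instrument or of the study's analysis code; the sketch only assumes what the text states, namely eight items scored 0 to 5 and summed.

```python
from typing import Sequence

# Hypothetical short labels for the eight CAT items listed in the text.
CAT_ITEMS = (
    "cough", "phlegm", "chest_tightness", "breathlessness",
    "activity_limitation", "confidence", "sleep", "energy",
)
ITEM_FLOOR, ITEM_CEILING = 0, 5


def cat_total(item_scores: Sequence[int]) -> int:
    """Sum eight item scores (each 0-5) into an overall score (0-40)."""
    if len(item_scores) != len(CAT_ITEMS):
        raise ValueError(f"expected {len(CAT_ITEMS)} item scores")
    for name, score in zip(CAT_ITEMS, item_scores):
        if not isinstance(score, int) or not ITEM_FLOOR <= score <= ITEM_CEILING:
            raise ValueError(f"{name}: score must be an integer from 0 to 5")
    return sum(item_scores)
```

A respondent answering 0 on every item sits at the floor of the overall score (0), and one answering 5 on every item sits at the ceiling (40).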

Normative values for the CAT questionnaire
Normative values for the CAT questionnaire were described using the mean, standard deviation, range (minimum and maximum), distribution of scores among males and females by percentiles, the percentage of responders at the lowest score possible (floor) and the highest score possible (ceiling) for the overall score, categorized by age groups. Participants were divided into five age categories, 40-49 years, 50-59 years, 60-69 years, 70-79 years and 80 years and above.
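As an illustration of the descriptive summary outlined above, the following sketch groups subjects into the five age categories and reports the listed statistics. It is not the study's Stata code: the function names are hypothetical, and the percentile rule (linear interpolation between order statistics) is an assumption that may differ from the convention used by the authors' software.

```python
import math
import statistics


def age_band(age):
    """Map an age in years to one of the five study age categories."""
    if age < 40:
        raise ValueError("the study population is aged 40 years and above")
    if age >= 80:
        return "80+"
    decade = int(age // 10) * 10
    return f"{decade}-{decade + 9}"


def percentile(sorted_scores, p):
    """p-th percentile by linear interpolation (assumed convention)."""
    k = (len(sorted_scores) - 1) * p / 100
    lower = math.floor(k)
    upper = min(lower + 1, len(sorted_scores) - 1)
    gap = sorted_scores[upper] - sorted_scores[lower]
    return sorted_scores[lower] + gap * (k - lower)


def describe(scores, floor=0, ceiling=40):
    """Mean, SD, range, percentiles and floor/ceiling percentages."""
    xs = sorted(scores)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two scores to compute an SD")
    return {
        "n": n,
        "mean": statistics.mean(xs),
        "sd": statistics.stdev(xs),  # sample standard deviation
        "min": xs[0],
        "max": xs[-1],
        "p25": percentile(xs, 25),
        "p50": percentile(xs, 50),
        "p75": percentile(xs, 75),
        "p95": percentile(xs, 95),
        "floor_pct": 100 * xs.count(floor) / n,
        "ceiling_pct": 100 * xs.count(ceiling) / n,
    }
```

Calling `describe` once per combination of `age_band` and sex would give a layout similar to the normative table described in the text.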

Identification of predictors of high CAT scores
From the descriptive statistics, we dichotomized subjects into those with CAT scores equal to or above the 95th centile in the study population (defined as a high CAT score), and those with scores below the 95th centile. We assessed the association of a high CAT score with variables defined a priori. These included sex, age, being a current smoker, forced expiratory volume in the first second (FEV1) as a percentage of the volume predicted by the NHANES equation for the subject, and presence of comorbidities including physician-diagnosed asthma, cardiovascular disease and musculoskeletal disease. Variables found to be significantly associated with a high CAT score were then entered into a multivariable logistic regression model to identify which of them were predictors of a high CAT score. We also repeated the above analysis using the suggested cut-off of 10 to define an abnormal CAT score.
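The dichotomization step can be sketched as follows. This is an illustration only: the nearest-rank percentile rule and the function names are assumptions made for the example, and the subsequent logistic regression is not reproduced here.

```python
import math


def nearest_rank_percentile(scores, p):
    """p-th percentile using the nearest-rank rule (assumed convention)."""
    xs = sorted(scores)
    rank = max(1, math.ceil(p / 100 * len(xs)))
    return xs[rank - 1]


def flag_high(scores, cutoff):
    """Return True for every score equal to or above the cut-off."""
    return [score >= cutoff for score in scores]


def dichotomize(scores, conventional_cutoff=10):
    """Apply both definitions of a high score used in the text."""
    centile_cutoff = nearest_rank_percentile(scores, 95)
    return {
        "centile_cutoff": centile_cutoff,
        "high_by_centile": flag_high(scores, centile_cutoff),
        "high_by_conventional": flag_high(scores, conventional_cutoff),
    }
```

Because scores equal to the cut-off count as high, the share of subjects flagged by the centile definition can be slightly above 5% when several subjects tie at the cut-off value.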

i) Face validity
We analyzed the overall CAT scores among smokers and non-smokers by sex to assess face validity.
ii) Internal consistency
The internal consistency of the eight items in the CAT was assessed using Cronbach alpha, which is a coefficient of internal consistency. It is a function of the correlation of an item on the questionnaire with the overall test score (item-test correlation) and with the other items on the questionnaire (item-rest correlation). Values are graded as excellent (≥0.9), good (0.8 ≤ α < 0.9), acceptable (0.7 ≤ α < 0.8), questionable (0.6 ≤ α < 0.7), poor (0.5 ≤ α < 0.6) and unacceptable (<0.5) [16].
iii) External validity
Canadian normative data for the SF-36 health questionnaire have been published [17] and were used as the reference standard for assessing the validity of the CAT. The Short Form 36 (SF-36) is a generic health-related quality of life (HRQOL) questionnaire that includes eight multi-item scales and a single health transition item. The scales are physical functioning (10 items), physical role limitations (4 items), emotional role limitations (3 items), bodily pain (2 items), social functioning (2 items), mental health (5 items), vitality (4 items) and general health perceptions (5 items); health transition is assessed with 1 item [18].
The domains of the SF-36 have also been aggregated to create composite measures of physical and mental health. The physical functioning, role physical, bodily pain, and general health scales are used to derive an aggregate physical health summary measure, and the role emotional, social functioning, mental health and vitality scales are aggregated to derive a mental health summary measure [19]. We analyzed the correlation of the overall CAT score with the aggregate physical and mental health summary measures of the SF-36, using Pearson's correlation coefficient, to assess the external validity of the test as a measure of health status. All data were analyzed using Stata version 11.0 (StataCorp, TX, USA).
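The two psychometric measures named above, Cronbach alpha and Pearson's correlation coefficient, can be sketched in a few lines. This is an illustration rather than the authors' Stata analysis: it uses the standard variance-based form of alpha, α = k/(k−1) · (1 − Σ item variances / variance of total score), and the function names and the grading helper are hypothetical.

```python
import math
import statistics


def cronbach_alpha(rows):
    """Cronbach alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    item_variances = [statistics.variance([row[i] for row in rows])
                      for i in range(k)]
    total_variance = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def grade_alpha(alpha):
    """Grade alpha using the thresholds quoted in the text [16]."""
    for threshold, label in ((0.9, "excellent"), (0.8, "good"),
                             (0.7, "acceptable"), (0.6, "questionable"),
                             (0.5, "poor")):
        if alpha >= threshold:
            return label
    return "unacceptable"


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sxx * syy)
```

Since a higher CAT score indicates worse health status and a higher SF-36 summary score indicates better health, a negative value of `pearson_r` between the two is the expected direction.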

Results
Normative values and psychometric properties of the CAT
A total of 525 non-COPD subjects were recruited in the study. CAT questionnaires were missing for 25 subjects (4.8%), so 500 subjects were included in the analysis. There was no significant difference in demographic characteristics between included and excluded subjects. As compared to the Canadian general population, individuals in the 40-49 year-old group were under-represented in this cohort, while individuals in the older age groups appeared to be over-represented. Almost two thirds of these individuals were retired from work and less than 10% were current smokers. The SF-36 scores for the included subjects were similar to those reported as normative data for the Canadian population [17]. The mean FEV1/FVC ratio was 0.77 (SD 0.49) with a mean predicted FEV1 of 99.38% (SD 16.88%). The demographic characteristics of the subjects are summarized in Table 1.
The Cronbach alpha for the CAT was 0.79, suggesting a high internal consistency in this population. The overall correlation coefficient for the test with the physical component of the SF-36 was −0.48, and with the mental component was −0.38. The normative values and distribution (by percentiles) for the overall CAT score by age group and sex are summarized in Table 2. The overall mean score was 6 (SD 5.09), and it was higher in females (6.43, SD 5.59) as compared to males (5.53, SD 4.45). The overall inter-quartile range was 2-8 (3-8 for males, 2-9 for females). CAT scores were similar across the age strata from 40-79 (mean score range for each stratum 5.54-6.37), but were higher for the group of subjects over 80 years of age (mean 7.58, SD 6.82) (Figure 2). Overall values ranged from 0 to 36.

Identification of predictors of high CAT scores
A score of 16 was found to be at the 95th percentile for the study population. Twenty-seven subjects (5.4%) were found to have a high CAT score (defined as ≥16). The group comprised 8 of the 246 subjects who were non-smokers (3.25%), 14 of the 221 subjects who were ex-smokers (6.33%), and 5 of the 33 subjects who were current smokers (15.15%). Current smoking (aOR 3.41, 95% CI 1.05, 11.02), subject-reported physician-diagnosed asthma (aOR 7.59, 95% CI 2.71, 21.25) and musculoskeletal disease (aOR 4.09, 95% CI 1.72, 9.71) were found to be significantly associated with a high CAT score. When classified based on the suggested cut-off score of 10, 96 subjects (19.2%) were found to have an abnormal CAT score. In the adjusted analysis using this cut-off, an increase in the FEV1 percentage predicted, existing cardiovascular disease and subject-reported physician-diagnosed asthma were the only factors found to be significantly associated with a high CAT score. Musculoskeletal disease did not have a significant effect in this analysis. Smoking was not found to be associated with a high CAT score (OR 1.64, 95% CI 0.74, 3.65). The results of the analyses are summarized in Table 3. A total of 254 subjects did not have any of the above-mentioned comorbidities. The mean CAT score in this subset of subjects was 5.11 (SD 4.1).
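The proportions of high CAT scores by smoking status follow directly from the counts reported above, and the arithmetic can be checked with a short sketch. The crude odds ratio function is included only to illustrate the calculation from a 2×2 table with a Woolf (log-based) confidence interval; it is unadjusted, it is an assumption of this example rather than the study's method, and it should not be compared with the adjusted odds ratios from the multivariable model.

```python
import math

# (subjects with CAT >= 16, subjects in group), as reported in the text.
HIGH_CAT_COUNTS = {
    "non_smoker": (8, 246),
    "ex_smoker": (14, 221),
    "current_smoker": (5, 33),
}


def percent(events, total):
    """Percentage of subjects in a group with a high CAT score."""
    return 100 * events / total


def crude_odds_ratio(exp_events, exp_total, ref_events, ref_total, z=1.96):
    """Unadjusted odds ratio with a Woolf confidence interval."""
    a, b = exp_events, exp_total - exp_events
    c, d = ref_events, ref_total - ref_events
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    for group, (events, total) in HIGH_CAT_COUNTS.items():
        print(f"{group}: {events}/{total} = {percent(events, total):.2f}%")
    print(crude_odds_ratio(*HIGH_CAT_COUNTS["current_smoker"],
                           *HIGH_CAT_COUNTS["non_smoker"]))
```

The group counts sum to 27 high scores among 500 subjects, which matches the overall 5.4% reported in the text.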

Discussion
This is one of the first studies to describe the normative values of the CAT questionnaire in a population-based sample of healthy subjects who were well characterized in terms of physiological measurements to confirm the presence or absence of COPD. These values will serve as a valuable benchmark for future norm-based comparisons.
The overall mean score for the CAT (6, SD 5.09) in this population was within the range of values found for the Arabic, Japanese, and Turkish versions tested in large non-COPD populations (mean scores of 5.4, 6.3 and 8.07 respectively) [8,13]. Women and older individuals (over 80 years of age in the present study) had higher CAT scores, consistent with the results from the BREATHE study [8]. The 95th centile for overall CAT score for this study (score of 16) was lower than that found for the Arabic and Turkish versions (21 and 28, respectively) [8]. The relationship observed between CAT score and smoking status suggests that, even in an unselected population, the CAT might be a useful screening tool for the assessment of respiratory health. In the adjusted analysis, a prior diagnosis of asthma by a physician appeared to have the strongest association with a high CAT score, followed by current smoking status, and these results possibly reflect the specificity of the questionnaire for respiratory health. The statistically significant association of a high score with musculoskeletal disease suggests that CAT scores may be slightly influenced by limitation of activity due to other conditions, although this effect disappeared when participants were categorized into higher and lower CAT scores using the cut-off of 10. The loss of the association between high CAT scores and smoking status when a cut-off of 10 was used suggests that a lower cut-off may result in a greater proportion of abnormal scores being a consequence of non-respiratory conditions, and this may be reflected by the presence of an association with patient-reported cardiovascular disease. However, further studies that attempt to anchor the score to physiological, functional and chest radiological imaging parameters in the normal population would be needed to derive the optimal cut-off for the score when used in this way.
Two definitions of a high CAT score were used: the first was ≥95th percentile, and the second was ≥10, which is considered the conventional cut-off.
Results from the testing of the psychometric properties of the questionnaire were consistent with the available literature, suggesting the CAT has good internal and external validity. The lack of a strong correlation of the CAT score with both the physical and the mental components of the SF-36 suggests that the CAT may be more suited to assessing respiratory health status than overall quality of life.
Our study has several strengths. The population sampling strategy based on the international BOLD study, the low proportion of missing data, and the consistency of the SF-36 scores obtained from the subjects in this study with Canadian normative data suggest that selection biases may have been minimized. The study by Nishimura et al. [13] described normative values for the CAT score in healthy industrial workers from Japan, and was therefore potentially biased by the "healthy worker effect" [20]. That study also did not use post-bronchodilator spirometric values for the diagnosis of COPD, thereby potentially misclassifying individuals with and without COPD. Our study was population-based and used post-bronchodilator values, as recommended by the guidelines [3].
When compared to the overall Canadian population, individuals in the age group of 40-49 years appear to have been under-represented. Also, a significant proportion of the individuals in our study were retired from work because of the age group selected, and the proportion of individuals with comorbidities also reflects the ageing population, unlike the study by Nishimura et al. [13], which comprised a working population. Finally, the subjects in our study were primarily Caucasian, with an under-representation of other ethnicities. These could represent potential selection biases, and are limitations of the study.

Conclusion
In conclusion, our study provides normative values for the CAT questionnaire from well-characterized individuals in a general population, to be used for norm-based comparisons. We have shown that current smoking status, physician-diagnosed asthma and musculoskeletal disease are predictors of a higher CAT score. Consistent with available literature, the CAT had a high validity not only in COPD but also in this general population. Follow-up in the longitudinal CanCOLD study will be useful in assessing the optimal cut-off for the score, and the association of changes in respiratory physiological and morphometric (CT scan imaging) measurements with changes in health status as measured by the CAT questionnaire.