Genetic susceptibility for chronic bronchitis in chronic obstructive pulmonary disease

Background: Chronic bronchitis (CB) is one of the classic phenotypes of COPD. The aims of our study were to investigate genetic variants associated with COPD subjects with CB relative to smokers with normal spirometry, and to assess for genetic differences between subjects with CB and without CB within the COPD population.
Methods: We analyzed data from current and former smokers from three cohorts: the COPDGene Study; GenKOLS (Bergen, Norway); and the Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints (ECLIPSE). CB was defined as having a cough productive of phlegm on most days for at least 3 consecutive months per year for at least 2 consecutive years. CB COPD cases were defined as having both CB and at least moderate COPD based on spirometry. Our primary analysis used smokers with normal spirometry as controls; secondary analysis was performed using COPD subjects without CB as controls. Genotyping was performed on Illumina platforms; results were summarized using fixed-effect meta-analysis.
Results: For CB COPD relative to smoking controls, we identified a new genome-wide significant locus on chromosome 11p15.5 (rs34391416, OR = 1.93, P = 4.99 × 10⁻⁸) as well as significant associations of known COPD SNPs within FAM13A. In addition, a GWAS of CB relative to those without CB within COPD subjects showed suggestive evidence for association on 1q23.3 (rs114931935, OR = 1.88, P = 4.99 × 10⁻⁷).
Conclusions: We found genome-wide significant associations with CB COPD on 4q22.1 (FAM13A) and 11p15.5 (EFCAB4A, CHID1 and AP2A2), and a locus associated with CB within COPD subjects on 1q23.3 (RPL31P11 and ATF6). This study provides further evidence that genetic variants may contribute to phenotypic heterogeneity of COPD.
Trial registration: ClinicalTrials.gov NCT00608764, NCT00292552
Electronic supplementary material: The online version of this article (doi:10.1186/s12931-014-0113-2) contains supplementary material, which is available to authorized users.

Background
COPD, a leading cause of morbidity and mortality, is characterized by persistent airflow limitation and phenotypic heterogeneity. While cigarette smoking is a major risk factor for COPD, the response to cigarette smoke is highly variable [1]. Chronic bronchitis (CB) and emphysema represent two classic phenotypes of COPD [2]. However, CB, which is defined clinically by chronic cough and phlegm, can occur in the absence of COPD [3]. Some studies have suggested that CB and emphysema have different genetic determinants [4,5]. CB has been reported to be associated with frequent respiratory exacerbations, increased respiratory symptoms, poor quality of life, and even increased mortality [6][7][8].
Although candidate gene testing and linkage analysis have been used to search for CB-related genetic determinants in selected populations [9,10] and recently a genome-wide association meta-analysis has reported genetic variants associated with chronic mucus hypersecretion mainly in subjects from the general population [11], genome-wide association studies (GWAS) of CB within COPD subjects have not been reported. Our primary hypothesis was that genetic variants would be associated with COPD-related CB. We also hypothesized that genetic heterogeneity exists according to the presence or absence of CB within COPD subjects. We addressed these hypotheses by comparing COPD subjects with CB to smokers with normal spirometry and to COPD subjects without CB as control groups.

Study cohorts
Subjects were current and former smokers from three studies: the non-Hispanic whites (NHWs) from the COPDGene Study (NCT00608764 at https://clinicaltrials.gov); GenKOLS (Bergen, Norway); and the Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints (ECLIPSE, NCT00292552 at https://clinicaltrials.gov). All subjects had self-described European white ancestry. The design and procedures for each participating study have been previously described [12][13][14]. For supplementary analysis, the African Americans (AAs) of the COPDGene Study were included. Institutional review board approval was obtained at each participating clinical center; all subjects provided written informed consent. This study was approved by the Partners HealthCare Institutional Review Board (COPDGene, 2007P000554; GenKOLS, 2009P000790; ECLIPSE, 2005P002467).

Variable definitions
CB was defined as chronic productive cough for 3 months in each of 2 successive years [15]. CB COPD cases were defined as having both CB and COPD of at least spirometry grade 2 (post-bronchodilator FEV1/FVC < 0.7 and FEV1 < 80% predicted), defined by the Global initiative for chronic Obstructive Lung Disease (GOLD 2-4) [16]. For CB COPD cases, primary analysis used current or former smokers with normal spirometry (post-bronchodilator FEV1/FVC ≥ 0.7 and FEV1 ≥ 80% predicted) as a control group. A secondary analysis was performed using COPD subjects without CB as controls to explore genetic heterogeneity within COPD subjects. Additionally, we performed GWAS of COPD subjects without CB relative to smoking controls for comparison to our results in CB COPD subjects (Figure 1). Additional variable definitions for complementary analyses are available in an online supplement.
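As a minimal sketch of these group definitions, the function below assigns a subject to an analysis group from post-bronchodilator spirometry and CB status. The function name, argument names, and group labels are hypothetical and are not taken from the study's analysis code; only the thresholds come from the text.

```python
from typing import Optional


def classify_subject(fev1_fvc: float, fev1_pct_pred: float, has_cb: bool) -> Optional[str]:
    """Assign an analysis group from post-bronchodilator spirometry.

    fev1_fvc      -- FEV1/FVC ratio (e.g. 0.62)
    fev1_pct_pred -- FEV1 as percent of predicted (e.g. 55.0)
    has_cb        -- True if the subject meets the chronic bronchitis definition
    """
    # GOLD 2-4: airflow obstruction with FEV1 below 80% predicted.
    if fev1_fvc < 0.7 and fev1_pct_pred < 80:
        return "CB_COPD" if has_cb else "COPD_no_CB"
    # Normal spirometry: smoking control (CB status is not part of this definition).
    if fev1_fvc >= 0.7 and fev1_pct_pred >= 80:
        return "smoking_control"
    # Anything else (e.g. FEV1/FVC < 0.7 with FEV1 >= 80% predicted) fits neither group.
    return None
```

Under this sketch, a subject with FEV1/FVC of 0.65 and FEV1 of 85% predicted is returned as `None`, since that combination meets neither the GOLD 2-4 nor the normal-spirometry definition.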

Genotyping quality control and imputation
Genotyping was performed using Illumina platforms [HumanOmniExpress for the COPDGene cohort, the HumanHap 550 (V1, V3, and Duo) for the GenKOLS cohort, and HumanHap 550V3 for the ECLIPSE cohort; Illumina, Inc., San Diego, CA]. Genotype imputation on the COPDGene cohorts was performed using MaCH [17] and minimac [18] using 1000 Genomes [19] Phase I v3 European (EUR) reference panels or cosmopolitan reference panels for NHWs and AAs, respectively. Details on genotyping quality control and imputation for the GenKOLS and ECLIPSE cohorts have been previously described [5,14][20][21][22][23]. Variants were included for analysis only if they passed genotyping or imputation quality control in all three cohorts.
Figure 1 Genome-wide association study design for chronic bronchitis. Definition of abbreviations: CB = chronic bronchitis; COPD = chronic obstructive pulmonary disease; GOLD = Global initiative for chronic Obstructive Lung Disease. GOLD 2-4 was defined as having a post-bronchodilator FEV1/FVC < 0.7 and FEV1 < 80% predicted. Normal spirometry was defined as having a post-bronchodilator FEV1/FVC ≥ 0.7 and FEV1 ≥ 80% predicted.
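The rule that a variant must pass quality control in every cohort amounts to a set intersection. The sketch below illustrates this under the assumption that each cohort's QC-passed variants are available as a set of identifiers; the function name and the placeholder identifiers in the usage note are hypothetical.

```python
from typing import Dict, Set


def variants_passing_in_all(qc_passed_by_cohort: Dict[str, Set[str]]) -> Set[str]:
    """Return the variant IDs that passed QC in every cohort."""
    cohort_sets = list(qc_passed_by_cohort.values())
    if not cohort_sets:
        return set()
    # A variant is kept only if it is present in each cohort's QC-passed set.
    return set.intersection(*cohort_sets)
```

For example, with placeholder sets `{"v1", "v2", "v3"}`, `{"v2", "v3"}`, and `{"v3", "v4"}` for the three cohorts, only `"v3"` would be carried forward to the meta-analysis.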

Statistical analysis
Logistic regression analysis of SNPs under an additive model of inheritance with case-control status as the outcome was performed in each cohort with adjustment for age, gender, pack-years of cigarette smoking and genetic ancestry-based principal components using PLINK 1.07 [24], as previously described [21][22][23]. Imputed genotypes were analyzed in a similar manner, using SNP dosage data in PLINK 1.07 [24]. We performed fixed-effects meta-analysis [25] using METAL (version 2011-3-25) [26] and R 2.15.1 (www.r-project.org) with the meta package. Heterogeneity was assessed by calculating both I² [27] and P values for Cochran's Q. In genomic regions with evidence of genetic heterogeneity, we also used a modified random-effects model optimized to detect associations under heterogeneity, since the fixed-effects model is based on inverse-variance-weighted effect size [28]. Genomic inflation factors [29] were calculated using GenABEL [30]. We used LocusZoom [31] to generate regional association plots, using the 1000 Genomes EUR reference data to calculate linkage disequilibrium (LD).
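To illustrate the fixed-effects approach, the sketch below combines per-cohort log odds ratios by inverse-variance weighting and computes Cochran's Q and I². It is an illustrative reimplementation under stated assumptions, not the METAL or R code used in the study; the function name and the inputs in the usage note are hypothetical. The P value for Cochran's Q is omitted because it requires a chi-square distribution function that is not in the Python standard library.

```python
import math
from typing import Dict, Sequence


def fixed_effect_meta(betas: Sequence[float], ses: Sequence[float]) -> Dict[str, float]:
    """Inverse-variance-weighted fixed-effect meta-analysis of log odds ratios.

    betas -- per-cohort log odds ratios for one SNP
    ses   -- per-cohort standard errors of those log odds ratios
    """
    if len(betas) != len(ses) or len(betas) == 0:
        raise ValueError("betas and ses must be non-empty and of equal length")
    weights = [1.0 / (se * se) for se in ses]          # w_i = 1 / SE_i^2
    total_w = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se = math.sqrt(1.0 / total_w)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))             # two-sided normal P value
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2: share of variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return {"beta": beta, "se": se, "or": math.exp(beta), "p": p, "q": q, "i2": i2}
```

With two hypothetical cohorts reporting log odds ratios of 0.0 and 2.0, each with a standard error of 0.5, the pooled log odds ratio is 1.0, Q is 8, and I² is 87.5%, which is the kind of result that would prompt the random-effects check described above.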
We used permutation testing [23] to assess differences in ORs of previous known genome-wide significant SNPs between two meta-analyses.

GWAS of CB COPD relative to smokers with normal spirometry
Baseline characteristics of each of the three primary cohorts are summarized in Table 1.
For the primary analysis of CB COPD relative to smokers with normal spirometry, the combined GWAS of three cohorts included 1,662 CB COPD cases and 3,520 controls. The quantile-quantile (Q-Q) plot showed no evidence of significant residual population stratification (Figure 2A; lambda = 1.03). Figure 3A shows a genome-wide significant association within the previously reported COPD susceptibility region on chromosome 4q22.1 in FAM13A and a second genome-wide significant association in a novel region on 11p15.5. The results for the most significant SNPs at each of these loci are listed in Table 2. Figure 4 displays the regional association plots for these two regions. The top 12 SNPs in the meta-analysis were located on 4q22.1 (FAM13A) and were either identical to, or in strong LD (r² ≥ 0.97) with, the top SNPs previously described in GWASs of pulmonary function [32,33] and COPD [21][22][23].
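The lambda values reported with the Q-Q plots are genomic inflation factors. The sketch below shows one common way to compute lambda from a list of P values, as the median 1-df chi-square statistic divided by its expected median under the null. It is not the GenABEL implementation used in the study. The significance helper assumes the conventional genome-wide threshold of 5 × 10⁻⁸, which the text does not state explicitly; both function names are hypothetical.

```python
from statistics import NormalDist, median
from typing import Sequence

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df
GENOME_WIDE_ALPHA = 5e-8     # conventional threshold (assumed, not stated in the text)


def genomic_inflation(p_values: Sequence[float]) -> float:
    """Estimate lambda from two-sided P values, each in the interval (0, 1]."""
    nd = NormalDist()
    # A two-sided P value maps to a 1-df chi-square statistic via the squared z score.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


def is_genome_wide_significant(p: float) -> bool:
    """True if the P value is below the assumed genome-wide threshold."""
    return p < GENOME_WIDE_ALPHA
```

A lambda close to 1, as with the values of 1.03 and 1.01 reported here, indicates that the bulk of test statistics follow the null distribution.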

GWAS of CB COPD relative to COPD subjects without CB
A GWAS of CB within COPD subjects from the three studies included the same number of CB COPD cases (1,662) and 3,777 COPD subjects without CB as a control group. Table 3 shows baseline characteristics of COPD subjects, and the corresponding Q-Q plot is shown in Figure 2B (lambda = 1.01). We found a novel suggestive locus on 1q23.3, which did not reach genome-wide significance (rs114931935, P = 4.99 × 10⁻⁷, Table 4 and Figure 3B). This locus includes ribosomal protein L31 pseudogene 11 (RPL31P11) and activating transcription factor 6 (ATF6) (Figure 5).
Data are presented as mean (SD) or percentage, as appropriate. Definition of abbreviations: CB = chronic bronchitis; NHW = non-Hispanic white.
Since the GWAS in COPDGene NHWs for COPD with CB versus COPD without CB identified a genome-wide significant SNP, rs12692398 on 2p25.1, we performed a meta-analysis of two studies (COPDGene NHWs and GenKOLS), in which the same SNP was also genome-wide significant (Additional file 1: Table S1 and Additional file 1: Figure S1). It is located within cystin-1 (CYS1), encoding a cilia-associated protein. This SNP did not demonstrate evidence for association with CB within ECLIPSE COPD cases.

Complementary analyses
To explore whether our results were similar when including an additional racial group, a supplemental meta-analysis of four cohorts, adding AAs of COPDGene, was performed for CB COPD relative to smoking controls. Additional file 1: Table S2 shows the baseline characteristics of AAs of COPDGene. The meta-analysis including 1,844 cases and 5,269 controls revealed similar results to those of three cohorts, with the exception of SNPs in both CHID1 and AP2A2, which were excluded because of their rarity in AA subjects (minor allele frequency < 0.01). The novel top SNP, rs34391416 (EFCAB4A), was genome-wide significant (OR = 1.93, P = 2.66 × 10⁻⁸).
Figure 2 The quantile-quantile plots for the three-cohort meta-analysis including 1000 Genomes project imputed data for (A) COPD subjects with chronic bronchitis (CB) versus smoking controls and (B) CB versus no CB within COPD subjects, after adjustment for age, sex, pack-years of cigarette smoking and genetic ancestry using principal components.
Figure 3 Manhattan plots of -log10 P values for meta-analysis of three cohorts for (A) COPD subjects with chronic bronchitis (CB) versus smoking controls and (B) CB versus no CB within COPD subjects, after adjustment for age, sex, pack-years of cigarette smoking and genetic ancestry using principal components.
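The exclusion of rare SNPs in the AA cohort rests on a minor allele frequency (MAF) cutoff of 0.01. As a minimal sketch, assuming genotypes are coded as allele dosages between 0 and 2 per subject, MAF and the filter can be computed as below; the function names are hypothetical and this is not the study's QC pipeline.

```python
from typing import Sequence


def minor_allele_frequency(dosages: Sequence[float]) -> float:
    """Minor allele frequency from per-subject allele dosages (0 to 2)."""
    if not dosages:
        raise ValueError("dosages must be non-empty")
    # Each subject carries two alleles, so the allele frequency is sum / (2 * n).
    freq = sum(dosages) / (2.0 * len(dosages))
    return min(freq, 1.0 - freq)


def passes_maf_filter(dosages: Sequence[float], threshold: float = 0.01) -> bool:
    """True if the SNP is common enough to keep (MAF >= threshold)."""
    return minor_allele_frequency(dosages) >= threshold
```

For instance, a SNP with a single heterozygous carrier among 100 subjects has a MAF of 0.005 and would be excluded at the 0.01 cutoff.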
Since CB was present in some of our smoking controls, a GWAS of CB COPD versus smoking controls without CB (n = 3,101) was performed for each of our three cohorts and then meta-analyzed. These results were similar, although the novel top SNP on 11p15.5 was slightly reduced in statistical significance (rs34391416, OR = 1.98, P = 6.50 × 10⁻⁸). However, a meta-analysis of four cohorts including AAs of COPDGene (n = 4,628) showed genome-wide significance of the same SNP (OR = 1.98, P = 2.76 × 10⁻⁸). Baseline characteristics of smoking controls without CB are summarized in Additional file 1: Table S3.
Because COPD subjects with CB were more likely to be current smokers, complementary meta-analyses were performed with adjustment for current smoking status as well as age, gender, pack-years of cigarette smoking and genetic ancestry-based principal components. In meta-analyses using smokers with normal spirometry as a control group, FAM13A SNPs remained genome-wide significant. One of the previously reported COPD risk loci, 15q25, was nearly genome-wide significant (P = 6.58 × 10⁻⁸). However, the novel SNP on 11p15.5 (rs34391416) was not genome-wide significant (P = 5.25 × 10⁻⁷ in three Caucasian cohorts and P = 2.60 × 10⁻⁷ in four cohorts including AAs, Additional file 1: Table S4). On the other hand, a meta-analysis using COPD subjects without CB as a control group, with adjustment for current smoking status, provided lower (but not genome-wide significant) P values of top SNPs from the secondary analysis of COPD with CB vs. COPD without CB (Additional file 1: Table S5).
We assessed the top SNPs of CB COPD susceptibility relative to smokers with normal spirometry (the primary meta-analysis) in the results of the secondary meta-analysis of CB vs. no CB within COPD subjects. The novel SNPs on 11p15.5 were nominally significant (P < 0.01), whereas SNPs near FAM13A were not significant (P > 0.1) (Additional file 1: Table S6). Clinical and radiological characteristics were compared according to genotypes of rs34391416 among all COPDGene NHW subjects (Additional file 1: Table S7). There were significant differences in parameters related to airway disease, including airway wall area% on inspiratory chest CT scans and gas trapping on expiratory CT. There were no differences in emphysema severity or distribution related to this SNP.
Figure 4 Local association plots for significant loci in the meta-analysis of cases with chronic bronchitis and COPD versus smoking control subjects in COPDGene non-Hispanic whites, GenKOLS, and ECLIPSE. A. rs2869967 on chromosome 4q22.1. B. rs34391416 on 11p15. The x-axis is chromosomal position, and the y-axis shows the -log10 P value. The most significant SNP at each locus is labeled in purple, with other SNPs colored by degree of linkage disequilibrium (r2). Plots were created using LocusZoom.
Since the meta-analysis of CB COPD relative to smoking controls showed FAM13A as the top gene, we performed additional analyses to ascertain whether SNPs near FAM13A had different levels of statistical significance between COPD with CB and COPD without CB. A meta-analysis of GWASs for COPD subjects without CB relative to smoking controls (Figure 1) also showed FAM13A as the top gene, which was followed by HHIP and IREB2 (Additional file 1: Table S8 and Additional file 1: Figure S2). ORs and P values of previously known COPD risk alleles among our results from meta-analyses for CB COPD or COPD without CB are summarized in Table 5. Permutation testing revealed that differences in ORs between our two meta-analyses were statistically significant at four SNPs in FAM13A.

Discussion
Our GWAS meta-analysis of three studies of COPD subjects with CB relative to smoking controls not only reconfirmed previously known genome-wide significant SNPs in FAM13A related to lung function [32,33] and COPD [21][22][23], but also revealed a novel locus on 11p15.5, including EFCAB4A, CHID1, and AP2A2. Proteins encoded by one or more of these three genes could be involved in CB. Interestingly, this new region is located next to MUC6 and MUC2 (Figure 5) [34]. Thus, it is also possible that this genomic region influences regulation of mucin genes to alter susceptibility to CB. EFCAB4A encodes a protein involved in store-operated Ca²⁺ entry. Intracellular Ca²⁺ was reported to regulate MUC2 expression [34,35] and mucin secretion from airway goblet cells [36]. In addition, a study demonstrated increased intracellular Ca²⁺ levels in lymphocytes of COPD patients, which correlated positively with the spirometric grade of COPD [37]. Gene expression microarray analysis of human bronchial epithelial cells identified overexpression of EFCAB4A during mucociliary differentiation [38]. While quantitative RT-PCR revealed high expression of EFCAB4A in lung [39], a role for EFCAB4A in CB remains to be defined.
Even though SNPs near CHID1 in the meta-analysis of three GWASs did not show genome-wide significance, rs147862429 was the most significant SNP in a GWAS of COPDGene NHWs, with a P value of 2.90 × 10⁻¹⁰. CHID1 encodes a saccharide- and LPS-binding protein, also called stabilin-1 interacting chitinase-like protein (SI-CLP), with possible roles in pathogen sensing and endotoxin neutralization [40]. It is expressed in cells of monocytic, T lymphocyte, B lymphocyte, and epithelial origin, and it is up-regulated by the Th2 cytokine interleukin-4 and dexamethasone in macrophages [41].
Other human chitinase and chitinase-like proteins were previously suggested to play a role in the development of COPD [42]. Chitotriosidase (CHIT1) levels were elevated in the bronchoalveolar lavage fluid of smokers with COPD [43]. A chitinase-like protein, commonly known as YKL-40, was also increased in the lungs of COPD patients [44]. A recent study demonstrated genetic associations between chitinase gene variants and lung function level and rate of decline in COPD patients from the Lung Health Study [45]. Therefore, CHID1 may be involved in the pathogenesis of CB.
AP2A2 encodes adaptor protein complex 2 subunit alpha-2, which has been shown to participate in the endocytosis of clathrin-coated vesicles through interaction with epsin-1 [46] and in receptor endocytosis with SHC-transforming protein 1 [47]. One study demonstrated long-range interactions between the MUC2 promoter and the adjacent AP2A2 gene by using quantitative chromosome conformation capture (q3C) [48]. Although human respiratory tract mucus contains mainly MUC5AC and MUC5B along with smaller amounts of MUC2, the distribution of MUC2 variable number tandem repeat (VNTR) alleles was reported to be different between asthmatics and non-asthmatics [49]. A follow-up study demonstrated relatively strong LD between SNPs in MUC2 and MUC5AC [50]. Therefore, AP2A2, either alone or through interactions with MUC2, may have a potential role in CB pathogenesis.
In the primary meta-analysis of CB COPD relative to smoking controls, we found the strongest signal within FAM13A rather than the other known COPD susceptibility genes, and permutation testing confirmed that ORs of FAM13A SNPs were significantly higher than those for non-CB COPD. While COPD is a complex disease with marked phenotypic heterogeneity, most previous genetic studies have dealt with COPD subjects as one homogeneous group [20][21][22]. The current study suggests that previously identified COPD risk alleles might have different effects on the development of different COPD subtypes.
Although our secondary meta-analysis of CB COPD relative to COPD without CB within the COPD population failed to demonstrate genome-wide significant SNPs, the fourth most significant SNP, rs2298019, was previously identified as an expression quantitative trait locus (eQTL) for ATF6 in lung tissue [51], with the risk allele associated with decreased expression. ATF6 plays a major role in transcriptional repression of endogenous cystic fibrosis transmembrane conductance regulator (CFTR) under endoplasmic reticulum stress [52] and is thought to be a potential therapeutic target for cystic fibrosis (CF) [53]. In addition to CF, suppressed CFTR function has been reported in cigarette smokers and COPD patients without CF [54,55]. Recently, roflumilast, approved to reduce COPD exacerbations in COPD patients with CB, has been reported to activate CFTR [56]. Since ATF6 is closely connected with CFTR, genetic variants of ATF6 may play a role in the pathogenesis of CB.
We found that SNPs in another gene (CYS1) on 2p25.1 demonstrated suggestive associations for CB. CYS1 is enriched in the ciliary axoneme, and high expression in the kidney and weak expression in the lung were reported [57]. While the top SNP, rs12692398 of CYS1, reached the genome-wide significance threshold in both a GWAS of only COPDGene NHWs and a meta-analysis of COPDGene NHWs and GenKOLS, it lost significance in the meta-analysis of all three cohorts (P = 1.66 × 10⁻⁴). It is unclear why the association evidence for CB of this genomic region within ECLIPSE was negative. In the meta-analysis of three cohorts, the other two SNPs of CYS1, rs13000481 and rs4574084, showed P values of 1.74 × 10⁻⁵ and 2.61 × 10⁻⁵, respectively, and LD between these two SNPs is high (0.94).
Our study has several limitations. First, we have not identified the functional genetic variants within our association regions. Nevertheless, we found significant differences in radiological parameters related to airway wall thickness according to genotypes of the novel top SNP, rs34391416, within COPDGene. These CT parameters have been frequently used as objective indicators of airway disease [6,58]. Interestingly, there were no differences in emphysema severity or distribution according to this SNP genotype. Further studies will be required to identify the functional genetic variants within this region and to determine which genes they influence. Second, we have not performed any independent replication, although this analysis was a meta-analysis of three GWASs. However, a supplemental meta-analysis of four cohorts, adding COPDGene AAs, showed results similar to those of three cohorts. Third, CB was present in some of our smoking controls. However, an additional meta-analysis of three GWASs of CB COPD versus smoking controls without CB showed similar results.

Conclusions
We have identified a novel locus on 11p15.5, which includes several biologically plausible candidates (EFCAB4A, CHID1 and AP2A2) as potential CB susceptibility genes. We have also found significantly increased effect sizes of FAM13A SNPs in COPD subjects with CB compared to those without CB. Although our secondary GWAS of CB versus no CB within COPD subjects did not show genome-wide significant SNPs, a locus including ATF6 should be explored for its related functional consequences. This study supports the concept that different genetic susceptibility contributes to phenotypic heterogeneity within COPD.

Additional file
Additional file 1: Variable definitions. Additional analysis methods.
Table S1. Top results of the meta-analysis for COPD subjects with chronic bronchitis (CB) versus COPD subjects without CB in COPDGene non-Hispanic white and GenKOLS cohorts.
Table S2. Baseline characteristics of COPD subjects with CB and smokers with normal spirometry as a control group in African Americans of COPDGene cohort.
Table S3. Baseline characteristics of COPD subjects with CB and smoking controls without CB.
Table S4. Top results of the two meta-analyses for COPD subjects with CB versus smokers with normal spirometry, including current smoking adjustment.
Table S5. Top results of three Caucasian cohorts meta-analyses for COPD subjects with CB versus without CB, including current smoking adjustment.
Table S6. Assessment of top results from COPD with CB versus smokers with normal spirometry within the meta-analysis for COPD subjects with CB versus without CB.
Table S7. Clinical and radiological characteristics according to genotypes of rs34391416 among subjects with COPD and smoking controls among non-Hispanic whites of COPDGene.
Table S8. Top results of three Caucasian cohorts meta-analyses for COPD subjects without CB versus smokers with normal spirometry.
Figure S1. Local association plots for significant loci for the meta-analysis of COPD subjects with CB versus COPD subjects without CB in COPDGene non-Hispanic whites and GenKOLS. The x-axis is chromosomal position, and the y-axis shows the -log10 P value. The most significant SNP at each locus is labeled in purple, with other SNPs colored by degree of linkage disequilibrium (r2). Plots created using LocusZoom.
Figure S2. (A) The quantile-quantile plot and (B) Manhattan plot of -log10 P values for the three-cohort meta-analysis including 1000 Genomes project imputed data for COPD subjects without CB versus smoking controls after adjustment for age, sex, pack-years of cigarette smoking and genetic ancestry using principal components.

Abbreviations
AA: African American; CB: Chronic bronchitis; ECLIPSE: Evaluation of COPD longitudinally to identify predictive surrogate endpoints; GOLD: Global initiative for chronic obstructive lung disease; GWAS: Genome-wide association study; NHW: non-Hispanic white; SNP: Single nucleotide polymorphism.