Increased methylation of lung cancer-associated genes in sputum DNA of former smokers with chronic mucous hypersecretion

Background: Chronic mucous hypersecretion (CMH) contributes to COPD exacerbations and increased risk for lung cancer. Because methylation of gene promoters in sputum has been shown to be associated with lung cancer risk, we tested whether such methylation was more common in persons with CMH.
Methods: Eleven genes commonly silenced by promoter methylation in lung cancer and associated with cancer risk were selected. Methylation-specific PCR (MSP) was used to profile the sputum of 900 individuals in the Lovelace Smokers Cohort (LSC). Replication was performed in 490 individuals from the Pittsburgh Lung Screening Study (PLuSS).
Results: CMH was significantly associated with an overall increased number of methylated genes, with SULF2 methylation demonstrating the most consistent association. The association between SULF2 methylation and CMH was significant in males but not in females in both the LSC and PLuSS (OR = 2.72, 95% CI = 1.51-4.91, p = 0.001 and OR = 2.97, 95% CI = 1.48-5.95, p = 0.002, respectively). Further, the association between methylation and CMH was more pronounced among 139 male former smokers with persistent CMH compared to current smokers (SULF2; OR = 3.65, 95% CI = 1.59-8.37, p = 0.002).
Conclusions: These findings demonstrate that male former smokers with persistent CMH, in particular, have markedly increased promoter methylation of lung cancer risk genes and could potentially be at increased risk for lung cancer.


Background
Chronic obstructive pulmonary disease (COPD) is predicted to become the third leading cause of death worldwide by 2020 [1]. Prevalence is increasing in both developing and developed countries as a result of tobacco consumption [2,3], environmental exposures such as pollution and biomass fuel smoke [4,5], and the growing elderly population [6]. Clinically, COPD is defined by the presence of poorly reversible airflow obstruction, although this definition simplifies the complex causes and manifestations of the disease [7]. Chronic mucous hypersecretion (CMH), characterized by persistent mucous cell metaplasia in the epithelial layer and submucosal glands of the respiratory tract, is a clinically important COPD phenotype [8]. CMH leads to worse respiratory symptoms, greater susceptibility to respiratory infections, more frequent COPD exacerbations, and increased risk of mortality [9-14].
Numerous publications, and two recent meta-analyses, have determined that prior CMH significantly increases the risk for later lung cancer [15,16]. While smoking clearly contributes to both diseases, analyses controlling for smoking have demonstrated that the association between lung cancer and prior CMH is at least partially independent of smoking [15,16]. It is therefore plausible that CMH and lung cancer have some shared molecular pathology. Previous case-control studies of incident lung cancer assessing the same genes as in the current study demonstrated that promoter methylation of these genes is associated with lung cancer risk [17,18].
The goal of this study was to determine whether CMH is associated with the prevalence of promoter methylation of lung cancer-predictive genes in sputum DNA of smokers. To this end, methylation-specific PCR (MSP) was used to assess promoter methylation of eleven genes in sputum samples of smokers from the Lovelace Smokers Cohort (LSC). Replication was performed in smokers from the Pittsburgh Lung Screening Study (PLuSS).

Study populations
This study was approved by the Western Institutional Review Board (Olympia, WA; #20031684), and all subjects signed informed consent for their participation. The catchment area for the LSC was the Albuquerque, NM metropolitan area, comprising a population of approximately 850,000 persons. Inclusion criteria for entry into the current study were age 40 to 75 years, current or former cigarette smoking (with a minimum of 10 pack-years) upon entry into the study, and ability to understand English. The LSC disproportionately enrolled female ever-smokers to study susceptibility to smoking-related lung diseases, since women are underrepresented in most such studies in the United States. Detailed characteristics of the LSC have been described elsewhere [19,20]. From the LSC, 311 non-Hispanic white (NHW) individuals meeting the criteria for CMH were included, along with 589 NHW current or former smoking controls. Current and former smoking was assessed by self-report at baseline, concurrent with sputum sampling. Former smokers were defined as those who had stopped smoking at least 2 years prior to self-report.
Study participants for the replication cohort were from the Pittsburgh Lung Screening Study (PLuSS), a volunteer cohort established to investigate lung cancer biomarkers in an at-risk population of smokers, which has been described previously [21,22]. From the total cohort (n = 3638), 490 NHW individuals (183 men and 307 women) had information allowing classification with respect to chronic mucous hypersecretion and had provided sputum for DNA isolation. Spirometric testing procedures have been described previously for both the PLuSS and the LSC [19,21].
Because a unifying definition of CMH was not available in both cohorts, two criteria were used to define CMH. In the LSC, CMH was defined as present in participants who self-reported cough productive of phlegm for at least 3 months per year for at least 2 consecutive years (i.e., the standard definition of chronic bronchitis). In the PLuSS, CMH was defined as self-reported cough productive of phlegm as assessed by both a first and a second questionnaire (with a median questionnaire interval of 3.5 years), together with self-reported cough producing phlegm for "most days a week" or "several days a week" in the past year, as assessed by the second questionnaire.
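The two cohort-specific definitions can be written as simple classification rules. The following is a minimal sketch only: the record types and field names (`LscResponse`, `PlussResponse`, and their attributes) are hypothetical, since the actual questionnaire items and data layout are not given in the text.

```python
from dataclasses import dataclass


@dataclass
class LscResponse:
    # Hypothetical fields; the real LSC questionnaire variables are not specified.
    productive_cough_months_per_year: int
    consecutive_years_with_symptoms: int


@dataclass
class PlussResponse:
    # Hypothetical fields; the real PLuSS questionnaire variables are not specified.
    productive_cough_first_questionnaire: bool
    productive_cough_second_questionnaire: bool
    phlegm_frequency_past_year: str  # answer given on the second questionnaire


# Frequency answers that qualify under the PLuSS definition quoted in the text.
PLUSS_QUALIFYING_FREQUENCIES = {"most days a week", "several days a week"}


def has_cmh_lsc(r: LscResponse) -> bool:
    """LSC rule: productive cough >= 3 months/year for >= 2 consecutive years."""
    return (r.productive_cough_months_per_year >= 3
            and r.consecutive_years_with_symptoms >= 2)


def has_cmh_pluss(r: PlussResponse) -> bool:
    """PLuSS rule: productive cough on both questionnaires plus a qualifying
    phlegm frequency in the past year on the second questionnaire."""
    return (r.productive_cough_first_questionnaire
            and r.productive_cough_second_questionnaire
            and r.phlegm_frequency_past_year in PLUSS_QUALIFYING_FREQUENCIES)
```

Under this sketch, a PLuSS participant who reported productive cough on only one of the two questionnaires would not be classified as having CMH, which reflects the persistence requirement in the definition.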

Statistical analysis
Chi-square and Fisher's exact tests were used for the univariate analyses of categorical variables, while two-sample t-tests and Kruskal-Wallis tests were used for continuous variables. For multivariable analyses of CMH, logistic regression was performed. Predictors included gene-specific methylation prevalence and total methylation (a continuous variable representing the number of genes methylated within an individual). Additional predictors included age, education (dichotomized as at least high school or less than high school education), COPD status, sex, pack-years of smoking, and current smoking status. When the LSC and PLuSS were combined for analysis, adjustment for cohort was included. Model-fitting iterations were performed with the R package glmulti, using the small-sample-size-corrected Akaike information criterion (AICc) to determine the best-fitting models [25]. All statistical analyses were performed in R version 2.12.0 or SAS version 9.2.
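To make two of these quantities concrete, the sketch below computes total methylation for one individual and the small-sample-corrected AIC, AICc = AIC + 2k(k + 1)/(n - k - 1), where k is the number of estimated parameters and n the number of observations. It is an illustration in Python rather than the R/glmulti workflow used in the study; the function names and the example log-likelihoods are hypothetical.

```python
def total_methylation(gene_status):
    """Count methylated genes for one individual.

    gene_status maps a gene name to True (methylated) or False (unmethylated).
    """
    return sum(1 for methylated in gene_status.values() if methylated)


def aicc(log_likelihood, n_params, n_obs):
    """Small-sample-corrected Akaike information criterion.

    AIC  = 2k - 2 ln(L)
    AICc = AIC + 2k(k + 1) / (n - k - 1)
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc is undefined unless n_obs > n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


def rank_models(fits, n_obs):
    """Order candidate models from best (lowest AICc) to worst.

    fits maps a model name to (log_likelihood, n_params).
    """
    return sorted(fits, key=lambda name: aicc(fits[name][0], fits[name][1], n_obs))


# Hypothetical log-likelihoods, for illustration only (not study results).
example_fits = {"covariates_only": (-120.0, 5), "three_gene": (-105.0, 8)}
best_first = rank_models(example_fits, n_obs=200)
```

Lower AICc indicates a better trade-off between fit and complexity; the correction term grows as the number of parameters approaches the sample size, which matters in the smaller stratified subgroups.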

CMH is associated with higher prevalence of gene promoter methylation in smokers
The initial study was conducted in 900 NHW current and former smokers from the LSC with available sputum methylation data. At the time of sputum collection, there were 311 smokers with and 589 smokers without CMH. In unadjusted analysis, prevalence of SULF2 methylation was significantly higher in those with CMH than in those without CMH (39% and 30%, respectively, p < 0.01, Table 1). A replication study was performed in the PLuSS, comprising 140 smokers with and 350 smokers without CMH. In unadjusted analysis, prevalence of SULF2 methylation was again significantly higher in those with CMH than in those without CMH (40% and 26%, respectively, p < 0.01, Table 2).
In adjusted analysis in the LSC, total methylation (defined as the number of the 11 genes methylated within an individual; see Methods) was significantly higher in smokers with CMH, as was methylation prevalence of SULF2, JPH3, and PCDH20 (p < 0.05, all analyses) (Table 3). Similarly, adjusted analysis in the PLuSS showed that total methylation was significantly higher in those with CMH, as was methylation prevalence of SULF2, p16, and PCDH20 (p < 0.05, all analyses) (Table 3).
Analyses combining the two cohorts were also performed. In both unadjusted (Additional file 1: Table S1) and adjusted (Table 3) analysis of the combined cohorts, total methylation was higher in those with CMH than in those without CMH, as was methylation prevalence of SULF2, JPH3, p16, and PCDH20 (p < 0.01, all analyses). Additional factors associated with CMH were younger age, less education, having COPD, greater pack-years, and current smoking (p < 0.01, all analyses, Additional file 1: Table S1). Additional modeling of the combined LSC and PLuSS cohorts included two-way interaction terms for baseline COPD, pack-years, and methylation (total or individual gene). These interaction terms were not significant for total methylation, SULF2, or PCDH20, each of which showed significant association with CMH within the LSC, within the PLuSS, and in the combination of both cohorts. These findings suggest that methylation is independently associated with CMH.
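As an illustration of the unadjusted comparison, the sketch below runs a Pearson chi-square test on a 2x2 table for SULF2 methylation in the LSC. The cell counts are an assumption: they are back-calculated from the rounded percentages and group sizes quoted above (39% of 311 and 30% of 589), not taken from Table 1, so the result is approximate. No continuity correction is applied, which may differ from the software defaults used in the study.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    Table layout:
                    methylated   unmethylated
        CMH             a             b
        no CMH          c             d
    Returns (chi-square statistic, two-sided p-value).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square upper tail is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts reconstructed from rounded percentages (assumption, approximate only).
cmh_total, control_total = 311, 589
cmh_methylated = round(0.39 * cmh_total)
control_methylated = round(0.30 * control_total)

chi2, p_value = chi_square_2x2(
    cmh_methylated, cmh_total - cmh_methylated,
    control_methylated, control_total - control_methylated,
)
```

With these reconstructed counts the p-value falls below 0.01, in line with the significance level reported in the text.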
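One way to write the interaction model described above is sketched below. The notation is ours, not the authors': M denotes methylation (total or a single gene), C baseline COPD, P pack-years, and z the remaining covariates (age, education, sex, current smoking, cohort). Whether all three pairwise terms were fitted together is an assumption, since the text does not specify it.

```latex
\operatorname{logit}\,\Pr(\mathrm{CMH}=1)
  = \beta_0 + \beta_M M + \beta_C C + \beta_P P
  + \beta_{MC}\,(M \times C) + \beta_{MP}\,(M \times P) + \beta_{CP}\,(C \times P)
  + \boldsymbol{\gamma}^{\top}\mathbf{z}
```

Under this formulation, non-significant estimates of the coefficients on M x C and M x P mean there is no evidence that the methylation effect differs by COPD status or smoking intensity.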

The association between CMH and gene promoter methylation is stronger in males
Univariate analysis revealed factors associated with higher methylation prevalence, including male sex (p < 0.001) (Additional file 1: Table S2). Because of the observed sex differences in methylation prevalence, sex-stratified analyses were performed. Total methylation was significantly associated with CMH in males in both the LSC and PLuSS cohorts (p < 0.01, both analyses) and in the combined cohort (p < 0.001) (Table 4). When individual genes were analyzed in males, SULF2, p16, and JPH3 were significantly associated with CMH in the LSC (p < 0.05, all analyses), while SULF2 and PCDH20 were significant in the PLuSS (p < 0.05). In the combined cohort, the prevalence of SULF2, JPH3, PCDH20, and p16 methylation was significantly higher in males with CMH than in males without CMH (p < 0.05, all analyses). In females, although the number of participants was higher in both cohorts, no significant associations were found in the individual cohort analyses; higher SULF2 methylation prevalence was observed only in analysis of the combined cohorts (p < 0.05).
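The abstract reports, for SULF2 methylation in males, OR = 2.72 (95% CI 1.51-4.91, p = 0.001) in the LSC and OR = 2.97 (95% CI 1.48-5.95, p = 0.002) in the PLuSS. As a consistency check, the sketch below recovers an approximate Wald p-value from an odds ratio and its confidence interval. It assumes the interval is a symmetric Wald interval on the log-odds scale, which the text does not state; the function name is hypothetical.

```python
import math


def wald_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate the standard error, z statistic, and two-sided p-value
    implied by an odds ratio and its 95% confidence interval.

    Assumes the interval is exp(log(OR) +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_value


# Values quoted in the abstract for SULF2 in males.
_, _, p_lsc = wald_p_from_or_ci(2.72, 1.51, 4.91)
_, _, p_pluss = wald_p_from_or_ci(2.97, 1.48, 5.95)
```

Under this assumption, both recovered p-values land close to the reported 0.001 and 0.002, so the reported intervals and p-values are mutually consistent.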

The association between CMH and gene promoter methylation is stronger in former smokers
Current smoking status and pack-years were controlled for in adjusted analyses (Tables 3 and 4); however, residual confounding remains a possibility, given that current smoking strongly influences CMH status (Tables 1 and 2). Therefore, stratified analyses were performed in current and former smokers. Adjusted stratified analysis in both the LSC and PLuSS revealed that, among former smokers, total methylation was significantly higher in those with CMH (p < 0.05, all analyses) (Table 5). Although the number of current smokers was greater in both cohorts, total methylation in current smokers was not significantly associated with CMH in either cohort or in the combined analysis. In general, the associations between methylation and CMH were less significant and demonstrated smaller effect sizes in current smokers (Table 5). Sex- and smoking-stratified analysis of the combined cohorts (combined to ensure adequate sample size for analysis) (Table 6) revealed that the strongest associations between methylation and CMH were observed in male former smokers, with odds ratios for the individual genes ranging from 2.55 to 4.34. Despite a 2- to 3-fold greater number of female participants in the LSC and PLuSS, only SULF2 methylation was associated with CMH in females from the combined cohorts.

Sputum methylation is a sensitive and specific predictor of CMH in male former smokers
Receiver operating characteristic (ROC) curves were generated to assess the sensitivity and specificity of logistic regression models for discriminating CMH. Prior to generating ROC curves, modeling was performed to assess all combinations of predictors, including all 11 genes and covariates. The corrected Akaike information criterion (AICc) [25] was used to select the models with an optimal trade-off between accuracy and complexity. Independently in both the LSC and the PLuSS, the best-fitting model was a 3-gene model that included SULF2, JPH3, and p16 as predictors, as well as age, pack-years, education, and COPD (data not shown). Therefore, using the combined sample from the LSC and PLuSS, ROC curves were generated for the 3-gene model, the full 11-gene model, and the covariates-only model in male former smokers (Figure 1). Likelihood ratio tests confirmed that both the 3-gene and 11-gene models were significantly more discriminative than the covariates-only model (p = 0.0002 and p = 0.002, respectively); however, the 3- and 11-gene models were not significantly different from each other (p = 0.29). Areas under the curve (AUC) were 0.74 and 0.80 for the 3- and 11-gene models, respectively, while the AUC was 0.55 for the covariates-only model. Although sample sizes were small in cohort-stratified analyses of male former smokers, these analyses demonstrate that the increased discriminative power of the 3-gene model is observed in two independent cohorts (Additional file 1: Figure S1).
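For readers unfamiliar with the AUC, the sketch below computes it directly from predicted probabilities using its rank interpretation: the AUC equals the probability that a randomly chosen individual with CMH receives a higher model score than a randomly chosen individual without CMH, with ties counted as one half. This is a generic illustration, not the software used in the study, and the example scores are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: model-predicted probabilities of CMH.
    labels: 1 for CMH, 0 for no CMH.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one case and one control")
    concordant = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                concordant += 1.0   # case ranked above control
            elif p == q:
                concordant += 0.5   # tie counts as half
    return concordant / (len(positives) * len(negatives))


# Hypothetical predicted probabilities, for illustration only.
example_auc = auc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0])
```

An AUC of 0.5 corresponds to chance-level discrimination, which puts the reported 0.55 for the covariates-only model in context against 0.74 and 0.80 for the gene-based models.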

Discussion
This study demonstrates a significant association between CMH and prevalence of promoter methylation in sputum of lung cancer risk genes in two geographically distinct cohorts. This association was especially strong in males and in former smokers, and SULF2 was the most consistently associated gene. Importantly, the overall association between CMH and methylation, and the specific effects of sex and smoking status, were observed independently in both cohorts. Combining the two cohorts strengthened the statistical significance of these associations. The central finding of our study is that male former smokers with unresolved CMH may be at an increased risk of lung cancer. Given that 50% of persons diagnosed with lung cancer are former smokers, prospective studies evaluating the methylation status of former smokers with CMH who subsequently develop lung cancer are needed [26].
The eleven genes examined in this study were selected based on prior evidence that they are associated with lung cancer risk [17,18]. Therefore, increased prevalence of methylation of these genes may predict lung cancer among subjects with CMH. These gene promoters have all been shown to be methylated in tumors [27,28], and are proposed to represent an expanding field of precancerous epigenetic changes in the aerodigestive tract of smokers [17,29]. This hypothesis is supported by the observation that the methylation prevalence of these gene promoters increases as the time to lung cancer diagnosis decreases [17]. Mounting evidence indicates that these changes are causal for tumor initiation [30-33].
The association between methylation and CMH was markedly stronger in males than in females (Table 4). Univariate analysis of males and females in both cohorts (Additional file 1: Tables S3 and S4) reveals that females with CMH are significantly younger than female controls in the LSC; however, this was not true in the PLuSS. Additionally, age was a covariate in all adjusted analyses and thus is unlikely to account for the lack of association between methylation and CMH in females. This apparent protective mechanism in females warrants further study. The association between methylation and CMH was also stronger in former than in current smokers (Table 5). The larger effect size in former smokers may have several explanations: (1) the CMH phenotype in former smokers may not be confounded by cough and phlegm caused by irritation due to current smoking; (2) in susceptible smokers, CMH that persists in spite of smoking cessation may represent a phenotype with a more distinct molecular pathology; (3) the association between CMH and gene promoter methylation may be stronger with age. In the LSC and PLuSS cohorts, former smokers were significantly older than current smokers (mean age difference 4.2 years, data not shown). This age difference between former and current smokers also likely explains the puzzling observation that current smokers have lower overall methylation than former smokers (Additional file 1: Table S2); current smokers are younger, and younger age is associated with less total methylation in these lung cancer risk genes.
Numerous studies have demonstrated that prior CMH significantly increases the risk for later development of lung cancer (reviewed in [15,16]). Assessment of the latency period between diagnosis of CMH and diagnosis of lung cancer has shown that this risk increases with time since diagnosis of CMH [34]. In one study [34], the odds ratio nearly quadrupled at latency >15 years compared to latency 1-5 years. Importantly, this suggests that CMH may serve as a precursor to lung carcinogenesis [34]. We hypothesize that the increased prevalence for methylation of the lung cancer risk genes seen in this study may help explain the epidemiological link between CMH and lung cancer. Further studies are needed to establish a direct link between gene methylation and lung cancer. Interestingly, while SULF2, p16, JPH3, and PCDH20 all demonstrate evidence for association with CMH in the current study, a previous study determined that GATA4 promoter methylation was associated with airflow obstruction [35]. These findings suggest that major differences exist in the genes affected by aberrant promoter methylation in distinct COPD sub-phenotypes. This is consistent with the major pathophysiological differences that underlie emphysema and chronic mucous hypersecretion [36], and suggests the role basal cell hyperplasia may play in development of lung cancer [37].
Of the 11 genes analyzed, SULF2 demonstrated the strongest association with CMH. SULF2 is an extracellular enzyme that catalyzes the hydrolysis of 6-O-sulfo groups from heparan sulfate polysaccharides [38-40]. Heparan sulfate proteoglycans (HSPGs) are widely distributed on cell membranes and the extracellular matrix and serve as coreceptors for many growth factors and cytokines [41]; the position of 6-O sulfates is of particular importance for ligand binding [38-40]. Inactivation of SULF2, either by siRNA treatment or by promoter methylation, activates numerous type I interferon (IFN)-inducible genes [42]. It was proposed that silencing of SULF2 prevents the removal of sulfate groups from IFN-binding sites, which may preserve either the binding affinity or the bioavailability of interferons, leading to increased transcription of multiple IFN-inducible genes [42]. It is plausible that CMH, caused by metaplastic mucous cells that are sustained by dysregulated cell death mechanisms involving IFN signaling [43-45], creates an inflammatory milieu that causes methylation of SULF2. In turn, the type I interferon response induced by methylation of SULF2 may help to perpetuate the inflammation associated with CMH.
This is the first report of epigenetic changes in the airways of individuals with CMH. Strengths of the study include the use of the large, well-characterized LSC for the initial phase of the study and excellent replication of all main findings in the geographically distinct PLuSS. We chose the standard definition of chronic bronchitis in the LSC and, in the PLuSS, a definition that most closely captured the standard clinical definition of chronic bronchitis. While the differences in the questionnaires used to define CMH could be considered a limitation of the study, the definition of CMH was applied to PLuSS subjects prior to any data analysis and was not subsequently modified. We propose that this approach improves the rigor of our validation. Replication of these findings supports the robustness of these markers for CMH and suggests that they are useful in defining a subset of subjects with CMH who could benefit from computed tomography (CT) screening for lung cancer [46]. Indeed, low-cost, gene-specific methylation screening assays could be incorporated into clinical practice for patients suspected to be at risk for lung cancer.

Conclusions
Male former smokers with persistent chronic mucous hypersecretion, in particular, have markedly increased promoter methylation of lung cancer risk genes in cells obtained by sputum collection. These smokers may be at increased risk of lung cancer and may benefit from further tests for lung cancer, such as CT screening.