Genes and pathways underlying susceptibility to impaired lung function in the context of environmental tobacco smoke exposure

Background: Studies assessing genetic susceptibility to impaired lung function upon exposure to environmental tobacco smoke (ETS) have thus far focused on candidate genes selected on the basis of a priori knowledge of potentially relevant biological pathways, such as glutathione S-transferases and ADAM33. Using a hypothesis-free approach, we aimed to identify novel susceptibility loci and additionally explored biological pathways potentially underlying this susceptibility to impaired lung function in the context of ETS exposure.
Methods: Genome-wide interactions of single nucleotide polymorphisms (SNPs) with ETS exposure (0 versus ≥1 h/day) in relation to the level of forced expiratory volume in one second (FEV1) were investigated in 10,817 subjects from the Dutch LifeLines cohort study, and verified in subjects from the Swiss SAPALDIA study (n = 1276) and the Dutch Rotterdam Study (n = 1156). SNP-by-ETS exposure p-values obtained from the identification analysis were used to perform a pathway analysis.
Results: Forty-five SNP-by-ETS exposure interactions with p-values <10⁻⁴ were identified in the LifeLines study, two of which were replicated with nominally significant p-values (<0.05) in at least one of the replication cohorts. Three pathways were enriched in the pathway-level analysis performed in the identification cohort LifeLines, i.e. the apoptosis, p38 MAPK and TNF pathways.
Conclusion: This first genome-wide gene-by-ETS interaction study on the level of FEV1 showed that pathways previously implicated in chronic obstructive pulmonary disease (COPD), a disease characterized by airflow obstruction, may also underlie susceptibility to impaired lung function in the context of ETS exposure.
Electronic supplementary material: The online version of this article (doi:10.1186/s12931-017-0625-7) contains supplementary material, which is available to authorized users.


Background
Detrimental effects of environmental tobacco smoke (ETS) exposure on the level of lung function have been shown in a number of studies. [1][2][3] Moreover, polymorphisms in several genes, such as the glutathione S-transferases (GSTs) [4,5] and ADAM33, [6] interact with ETS exposure, thereby negatively affecting the level and decline of lung function. In other words, ETS exposure has different effects in subjects carrying mutant alleles than in subjects carrying wild-type alleles. Thus far, studies assessing interactions between genetic variants and ETS exposure have been driven by a priori knowledge of biological pathways, for example detoxification of noxious particles and gases by proteins such as the glutathione S-transferases.
In contrast, genome-wide interaction (GWI) studies are hypothesis-free and may yield novel loci in addition to those already known to be associated with impaired lung function directly (i.e. in genome-wide association studies that do not take exposure into account). [7] Findings from such studies may provide novel insights into molecular pathways underlying disease pathogenesis, e.g. the development of chronic obstructive pulmonary disease (COPD), which is characterized by airflow obstruction. In this first genome-wide gene-by-ETS exposure study, we aimed to identify novel susceptibility loci and additionally explored biological pathways potentially underlying this susceptibility to impaired forced expiratory volume in one second (FEV1) in the context of ETS exposure during adulthood.

Cohort and measurements
Subjects from the baseline investigation of the LifeLines cohort study (2006-2011), a multi-disciplinary prospective population-based cohort study examining health and health-related behaviour of persons living in the North East region of the Netherlands, were included as the identification cohort. [8,9] To verify our initial findings we included subjects from two independent general population-based cohorts, the Swiss SAPALDIA study [10] and the Dutch Rotterdam Study. [11] A more detailed description of the cohorts, spirometric measurements and genotyping can be found in the online supplement (Additional file 1: Methods).

Exposure assessment
In the LifeLines and SAPALDIA studies, ETS exposure was determined by the response to the question "how many hours per day are you exposed to other person's tobacco smoke?". In the Rotterdam Study the question was similar, but focused specifically on exposure at home: "at home, how many hours per day are you exposed to other person's tobacco smoke?". In all three cohorts, subjects were classified as ETS exposed if they reported at least 1 h/day (≥1 h/day) of exposure to other people's tobacco smoke, and as non-exposed if self-reported ETS exposure was 0 h/day. Subjects who reported between 0 and 1 h/day of ETS exposure were excluded in order to obtain a clear exposure contrast.
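The three-way classification described above can be written as a short rule. The following is only an illustrative sketch: the function name `classify_ets` and the example values are hypothetical and are not taken from the study's analysis code.

```python
from typing import Optional


def classify_ets(hours_per_day: float) -> Optional[bool]:
    """Classify self-reported ETS exposure.

    Returns True for exposed (>= 1 h/day), False for non-exposed (0 h/day)
    and None for subjects excluded from the analysis (between 0 and 1 h/day).
    """
    if hours_per_day < 0:
        raise ValueError("hours per day cannot be negative")
    if hours_per_day == 0:
        return False
    if hours_per_day >= 1:
        return True
    return None  # 0 < hours < 1: excluded to keep a clear exposure contrast


# Hypothetical self-reports (hours per day), for illustration only.
reported = [0, 0.5, 1, 3, 0]
labels = [classify_ets(h) for h in reported]
included = [label for label in labels if label is not None]
```

Subjects mapped to `None` are dropped before modelling, which mirrors the exclusion of intermediate exposure levels described in the text.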

SNP-by-ETS exposure interactions
First, interactions of 227,981 genotyped SNPs with daily ETS exposure and their association with FEV1 were assessed in the identification cohort LifeLines. SNPs were tested in an additive genetic model including SNP, ETS exposure and the SNP-by-ETS exposure interaction. All models were fitted by linear regression in the software package PLINK, version 1.07, [12] and were adjusted for sex, age, height, ever smoking and pack years. SNPs with p-values for the SNP-by-ETS interaction <10⁻⁴ were taken forward for verification in the two independent cohorts, the SAPALDIA study and the Rotterdam Study (Additional file 1: Methods).
Finally, for the SNPs identified in the identification analysis, SNP-by-ETS interaction effects from the three cohorts were meta-analyzed using fixed effects models with effect estimates weighted by the inverse of the standard errors, using the software package METAL. [13] We selected SNP-by-ETS interactions with the same direction of interaction in all three cohorts and considered interactions with p-values < 0.05 in at least one of the two replication cohorts to be significantly replicated. SNP annotation was performed using HaploReg version 4.1. [14]
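To make the model structure concrete, the sketch below fits a single additive SNP-by-ETS interaction model by ordinary least squares with NumPy. It is not the PLINK implementation used in the study: the data are simulated, all variable names and effect sizes are hypothetical, and the p-value uses a normal approximation, which is an assumption that is reasonable only for large samples.

```python
import math

import numpy as np


def fit_interaction(fev1, snp, ets, covariates):
    """Fit FEV1 ~ SNP + ETS + SNP*ETS + covariates by ordinary least squares.

    Returns the interaction estimate, its standard error and a two-sided
    p-value based on a normal approximation (an assumption for large n).
    """
    n = len(fev1)
    # Columns: intercept, SNP (0/1/2 allele count), ETS (0/1), interaction, covariates.
    X = np.column_stack([np.ones(n), snp, ets, snp * ets, covariates])
    beta, *_ = np.linalg.lstsq(X, fev1, rcond=None)
    resid = fev1 - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    z = beta[3] / se[3]
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta[3], se[3], p


# Simulated data with a hypothetical interaction effect of -0.10 L.
rng = np.random.default_rng(0)
n = 5000
snp = rng.binomial(2, 0.3, n).astype(float)
ets = rng.binomial(1, 0.23, n).astype(float)
sex = rng.binomial(1, 0.5, n)
age = rng.normal(48, 11, n)
height = rng.normal(175, 9, n)
ever_smoker = rng.binomial(1, 0.5, n)
pack_years = ever_smoker * rng.exponential(10, n)
covs = np.column_stack([sex, age, height, ever_smoker, pack_years])
fev1 = (3.4 + 0.5 * sex - 0.03 * (age - 48) + 0.04 * (height - 175)
        - 0.005 * pack_years - 0.10 * snp * ets + rng.normal(0, 0.1, n))

beta_int, se_int, p_int = fit_interaction(fev1, snp, ets, covs)
```

In a genome-wide setting the same model would be fitted once per SNP, and the interaction p-value compared with the chosen threshold.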
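The meta-analysis and direction check can be sketched as follows. This is a minimal illustration, not METAL itself: it assumes the conventional inverse-variance scheme in which each cohort's weight is 1/SE², it also computes Cochran's Q and the I² heterogeneity statistic, and the effect estimates shown are hypothetical.

```python
import math


def same_direction(betas):
    """True if all cohort-specific estimates share the same sign."""
    return all(b > 0 for b in betas) or all(b < 0 for b in betas)


def fixed_effect_meta(betas, ses):
    """Fixed-effect meta-analysis with inverse-variance weights (1 / SE^2).

    Returns the pooled estimate, its standard error, a two-sided p-value
    (normal approximation) and the I^2 heterogeneity statistic in percent.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    p = math.erfc(abs(pooled / pooled_se) / math.sqrt(2))
    # Cochran's Q and I^2 = max(0, (Q - df) / Q) * 100.
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, pooled_se, p, i2


# Hypothetical interaction estimates (L) and standard errors for three cohorts.
betas = [-0.10, -0.10, -0.10]
ses = [0.02, 0.05, 0.05]
pooled, pooled_se, p_meta, i2 = fixed_effect_meta(betas, ses)
consistent = same_direction(betas)
```

Because the largest cohort has the smallest standard error, it dominates the pooled estimate, which is the expected behaviour of a fixed-effect model.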

Pathway analysis
All p-values for SNP-by-ETS interactions from the identification cohort LifeLines were used in the pathway analysis, which was performed with the online improved gene set enrichment analysis tool i-GSEA-4-GWAS. [15] Log-transformed p-values for all SNPs within 100 kb upstream or downstream of a gene were used to represent that gene, and each gene was represented by the lowest SNP p-value annotated to it. These SNP p-values were used to rank the genes, and the proportion of significant genes among the total number of genes belonging to a pathway (the gene set) was calculated. Based on the rank, p-values were calculated for the association between the total gene set/pathway and the outcome. Additionally, false discovery rate (FDR) corrected p-values were calculated. Gene sets/pathways with FDR-corrected p-values < 0.25 are regarded as suggestively associated with the outcome, whereas FDR p-values < 0.05 are regarded as highly confident for an association with the outcome. [15]
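The gene-level summarisation described above can be illustrated with a simplified sketch. It does not reproduce the i-GSEA-4-GWAS algorithm: the gene and SNP records are hypothetical, and the fixed p-value cutoff used to call a gene significant is an assumption made purely for illustration. Only the 100 kb window and the FDR cutoffs of 0.05 and 0.25 come from the text.

```python
from typing import Dict, List, NamedTuple

WINDOW = 100_000  # 100 kb up- and downstream of each gene


class Gene(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int


class Snp(NamedTuple):
    rsid: str
    chrom: str
    pos: int
    p: float


def gene_p_values(genes: List[Gene], snps: List[Snp]) -> Dict[str, float]:
    """Represent each gene by the lowest p-value of SNPs within the window."""
    best = {}
    for g in genes:
        ps = [s.p for s in snps
              if s.chrom == g.chrom and g.start - WINDOW <= s.pos <= g.end + WINDOW]
        if ps:
            best[g.name] = min(ps)
    return best


def proportion_significant(gene_set, gene_p, cutoff):
    """Share of the gene set's mapped genes with a p-value at or below cutoff."""
    present = [g for g in gene_set if g in gene_p]
    if not present:
        return 0.0
    return sum(gene_p[g] <= cutoff for g in present) / len(present)


def classify_fdr(fdr: float) -> str:
    """Label a pathway using the FDR cutoffs quoted in the text."""
    if fdr < 0.05:
        return "high confidence"
    if fdr < 0.25:
        return "suggestive"
    return "not enriched"


# Hypothetical genes and SNPs, for illustration only.
genes = [Gene("GENE_A", "1", 1_000_000, 1_050_000),
         Gene("GENE_B", "1", 2_000_000, 2_020_000),
         Gene("GENE_C", "2", 500_000, 600_000)]
snps = [Snp("rsA1", "1", 950_000, 1e-5), Snp("rsA2", "1", 1_020_000, 0.2),
        Snp("rsB1", "1", 2_200_000, 1e-6),  # outside GENE_B's window
        Snp("rsB2", "1", 2_010_000, 0.4), Snp("rsC1", "2", 650_000, 0.03)]
gene_p = gene_p_values(genes, snps)
share = proportion_significant(["GENE_A", "GENE_B", "GENE_C", "GENE_D"], gene_p, 0.05)
```

Genes without any mapped SNP, such as the hypothetical `GENE_D`, are left out of the denominator, in line with counting only genes present in the analysis.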

Descriptive statistics
In the LifeLines study, complete data on all covariates were available for 11,187 subjects. Of these, 370 subjects (3%) were excluded because they had self-reported ETS exposure between 0 and 1 h per day. The final analysis therefore included 10,817 subjects, of whom 2473 (23%) reported at least one hour of ETS exposure per day (Table 1). In the verification cohorts, complete data were available for 1276 subjects in the SAPALDIA study (23% being ETS exposed) and 1156 subjects in the Rotterdam Study (19% being ETS exposed) (Table 1).

SNP-by-ETS exposure interactions
P-values for each SNP-by-ETS exposure interaction in association with the level of FEV1 in the identification analysis are shown in a Manhattan plot (Fig. 1). A total of 45 SNPs had interaction p-values <10⁻⁴ in the identification cohort LifeLines (Additional file 2: Table S1) and were taken forward for verification in the two independent cohorts. Of these 45 identified SNPs, rs4421160 was not available in the SAPALDIA study and two other SNPs, rs13282467 and rs743262, were not available in the Rotterdam Study.
Of the 45 identified SNP-by-ETS interactions, nine had the same direction of effect in all three cohorts. For all nine interactions there was little indication of heterogeneity between the cohorts (I² < 30 for all interactions) (Table 2). Two of these nine interactions were nominally significant (rs11950494 and rs2090789, p < 0.05) in one of the two verification cohorts. None of the nine interactions reached genome-wide significance in the meta-analysis of all three cohorts (i.e. the Bonferroni-corrected threshold for the 227,981 SNPs tested in the initial identification analysis, p < 2.19 × 10⁻⁷) (Table 2).
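As a quick arithmetic check, the Bonferroni threshold follows from dividing the significance level by the number of SNPs tested. The sketch below assumes a family-wise significance level of 0.05, which matches the value of approximately 2.19 × 10⁻⁷ quoted in the Discussion.

```python
# Bonferroni correction: alpha divided by the number of tests performed.
alpha = 0.05        # assumed family-wise significance level
n_snps = 227_981    # SNPs tested in the identification analysis
threshold = alpha / n_snps
```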

Pathway analysis
Of all SNP-by-ETS interactions included in the identification analysis in the LifeLines study, p-values of 165,298 interactions were mapped to 15,243 genes and 231 gene sets/pathways. Pathway analysis showed one significantly enriched pathway (FDR p-value < 0.05) and two suggestively enriched pathways (FDR p-value < 0.25) (Table 3). The most significant, i.e. the apoptosis pathway, includes 71 genes, of which 54 were present in the identification analysis and 23 were significantly associated with the outcome (Table 4). The two suggestively associated pathways were the p38 MAPK pathway (Table 5) and the tumor necrosis factor (TNF) pathway (Table 6), with 16 genes and 9 genes, respectively, from the SNP-by-ETS exposure interaction analysis that were significantly associated with the outcome.

Discussion
The current study is the first to explore SNP-by-ETS exposure interactions on the level of FEV1 during adulthood in a hypothesis-free, genome-wide manner. We extended our findings to a pathway-level analysis and showed that several pathways, i.e. the apoptosis, p38 MAPK and TNF pathways, may underlie susceptibility to impaired FEV1 in the context of ETS exposure. The SNP with the most significant interaction in the identification cohort was located in the gene coding for KCNH1, also known as ether-à-go-go (EAG1). KCNH1 is a voltage-gated potassium channel that is highly expressed on mast cells and macrophages in germinal centers of reactive lymph nodes, [16] which may indicate its involvement in immune responses. Moreover, both mRNA and protein expression of KCNH1 were upregulated during epithelial-to-mesenchymal transition (EMT) of human lung tumor cells induced by TGFβ1. [17] Increased expression of EMT markers has been observed in the airways of smokers, especially those with COPD. [18] Although one SNP in KCNH1 (rs7526579) had the same direction of interaction effect in all three cohorts, it did not reach genome-wide significance after meta-analysis and did not reach nominal significance in either of the verification cohorts. These findings therefore remain speculative.
Two SNP-by-ETS interactions identified in the LifeLines study had nominally significant p-values in at least one of the verification cohorts, i.e. rs11950494 in SAPALDIA and rs2090789 in the Rotterdam Study. SNP rs11950494 is located in the gene actin beta-like 2 (ACTBL2) and rs2090789 is located in a predicted noncoding RNA, LOC100128993. Both genes are expressed in lung tissue (genecards.org); however, at present little is known about their biological function in general or their relevance to lung function specifically.
In the current study, we used a large and well documented homogeneous cohort of a general population, i.e. the LifeLines study, to assess SNP-by-ETS exposure interactions. We used a liberal p-value threshold (p < 10⁻⁴) in the identification analysis and attempted to verify the SNP-by-ETS exposure interactions in two independent cohorts, the SAPALDIA and Rotterdam studies. Only two SNP-by-ETS exposure interactions were replicated with a nominal p-value and with the same direction of effect, which is fewer than expected by chance alone (i.e. 5% of 45 SNPs = 2.25). Moreover, none of the SNPs reached the Bonferroni-corrected threshold for genome-wide significance (p-value = 2.19 × 10⁻⁷). Interpretation of the results therefore remains difficult and the implications of the outcomes uncertain. The replication cohorts were relatively small, which may have limited the power to significantly replicate our findings. Another reason for not finding significant interaction effects may be the rather crude assessment of ETS exposure. In general, measuring ETS exposure during adulthood is difficult, especially when using self-reports. Thus far, no GWI studies on ETS exposure during adulthood have been published, suggesting that either no studies have been performed, or that publication bias exists due to null findings. There were slight differences in characteristics between the cohorts, i.e. enrichment with asthmatics in SAPALDIA (40% asthmatics) and the older mean age of subjects in the Rotterdam Study. However, sensitivity analysis in the identification cohort suggested that effect estimates for the two marginally replicated SNPs did not change when only non-asthmatics were included. Moreover, SNP-by-ETS interaction effects tended to be more, rather than less, pronounced in older (≥50 years) than in younger subjects (<50 years) (data not shown).
Interestingly, a final sensitivity analysis showed that the associations of these two SNP-by-ETS interactions with FEV1 remained only in subjects without airway obstruction (FEV1/FVC ≥ 70%) (data not shown), suggesting that genetic susceptibility to the effects of ETS is less important in subjects who already have airway obstruction.
In addition to single SNP analysis we performed a pathway analysis based on interaction p-values in the LifeLines study. Compared to single SNP analysis, pathway analysis may have increased power to detect genetic associations of the phenotype with a gene set/pathway. [19] Three pathways were significantly or suggestively enriched, i.e. the apoptosis, p38 MAPK and TNF pathways. Interestingly, all three pathways may mutually interact and have previously been implicated in the pathogenesis of COPD, a disease caused by an abnormal inflammatory response to noxious particles and gases leading to airflow obstruction. Apoptosis is a programmed form of cell death. Previous investigations within the SAPALDIA study have found suggestive evidence that genetic variation in the apoptosis pathway modifies the effect of pack years smoked on the decline of FEV1. [20] An imbalance between apoptosis and proliferation of alveolar epithelial and endothelial cells has been observed in the lungs of patients with COPD. [21] Apoptosis is regulated by various pathways. One of these responds to extracellular signals through binding of members of the tumor necrosis factor family, such as TNF-alpha, to death receptors such as TNF-receptor 1. [21] Cigarette smoke exposure, for example, was shown to increase TNF-alpha expression. [22] This interaction between the different pathways was also reflected by the substantial overlap in genes enriched in the TNF-alpha (Table 6) and apoptosis pathways (Table 4) in the pathway analysis. Another pro-apoptotic pathway responds to physical and chemical stressors via the release of cytochrome C by mitochondria. Subsequent formation of an apoptosome activates several caspases, which eventually initiate apoptosis.
Interestingly, we identified an intronic SNP in APAF1 that interacted with ETS exposure (p-value = 5.88 × 10⁻⁵) in the identification cohort LifeLines; the protein expressed by this gene is part of the apoptosome that initiates apoptosis (Additional file 1: Table S1 and Table 4). However, this SNP-by-ETS exposure interaction was not replicated in the SAPALDIA study or the Rotterdam Study.
The TNF pathway was suggestively enriched in the pathway analysis. TNF-alpha is a cytokine that plays an important role in inflammation through its activation of several downstream signaling cascades, among others the p38 MAPK pathway. Levels of TNF-alpha have been shown to be increased in sputum of COPD patients compared to both non-smoking and smoking controls, and in response to air pollution exposure. [23,24] The second suggestively enriched pathway was the p38 mitogen-activated protein kinase (MAPK) pathway, which has also been implicated in the development and/or maintenance of a number of chronic airway inflammatory diseases such as COPD. [25] The p38 MAPK pathway is activated by various environmental stressors, growth factors and cytokines, and in turn regulates the expression of inflammatory cytokines such as TNF-alpha and may initiate apoptosis. [26] Increased activation of p38 MAPK was seen in alveolar walls and alveolar macrophages of COPD patients compared to non-smoking and smoking controls. [27]
The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Availability of data and materials
The datasets used during the current study are available from the corresponding author on reasonable request.

Authors' contributions
KdJ participated in the study design, analysis and interpretation of the data, and drafting of the manuscript, tables and figures. HMB, JMV and DSP obtained funding and determined the study design. MI, NPH, LL, AH and GGB were involved in the replication analysis. All authors participated in the interpretation of data and approved the final version of the manuscript.

Ethics approval and consent to participate
LifeLines: The LifeLines cohort study was approved by the Medical Ethics Committee of the University Medical Center Groningen, Groningen, The Netherlands. All subjects gave written informed consent.
SAPALDIA: Ethical clearance for the SAPALDIA study was obtained from the Swiss Academy of Medical Sciences, the National Ethics Committee for Clinical research (UREK, Project Approval Number 123/00) and the Ethics Committees of the eight participating communities including Basel, Wald, Davos, Lugano, Montana, Payerne, Aarau and Geneva. Participants provided informed consent for participation in the health interviews, physical examinations, blood marker and genetic assays.
The Rotterdam Study: The Rotterdam Study was approved by the medical ethics committee of Erasmus University. All subjects provided written informed consent.

Consent for publication
Not applicable.

Competing interests
Dirkje S. Postma: The University of Groningen has received money for Professor Postma regarding a grant for research from AstraZeneca, Chiesi, Genentech, GSK and Roche. Fees for consultancies were given to the University of Groningen by AstraZeneca, Boehringer Ingelheim, Chiesi, GSK, Takeda and TEVA. All other authors have declared that no competing interests exist.