Integrative genomics identifies new genes associated with severe COPD and emphysema

Background Genome-wide association studies have identified several genetic risk loci for severe chronic obstructive pulmonary disease (COPD) and emphysema. However, these studies do not fully explain disease heritability and in most cases, fail to implicate specific genes. Integrative methods that combine gene expression data with GWAS can provide more power in discovering disease-associated genes and give mechanistic insight into regulated genes. Methods We applied a recently described method that imputes gene expression using reference transcriptome data to genome-wide association studies for two phenotypes (severe COPD and quantitative emphysema) and blood and lung tissue gene expression datasets. We further tested the potential causality of individual genes using multi-variant colocalization. Results We identified seven genes significantly associated with severe COPD, and five genes significantly associated with quantitative emphysema in whole blood or lung. We validated results in independent transcriptome databases and confirmed colocalization signals for PSMA4, EGLN2, WNT3, DCBLD1, and LILRA3. Three of these genes were not located within previously reported GWAS loci for either phenotype. We also identified genetically driven pathways, including those related to immune regulation. Conclusions An integrative analysis of GWAS and gene expression identified novel associations with severe COPD and quantitative emphysema, and also suggested disease-associated genes in known COPD susceptibility loci. Trial registration NCT00608764, Registry: ClinicalTrials.gov, Date of Enrollment of First Participant: November 2007, Date Registered: January 28, 2008 (retrospectively registered); NCT00292552, Registry: ClinicalTrials.gov, Date of Enrollment of First Participant: December 2005, Date Registered: February 14, 2006 (retrospectively registered). 
Electronic supplementary material The online version of this article (10.1186/s12931-018-0744-9) contains supplementary material, which is available to authorized users.


Background
Chronic obstructive pulmonary disease (COPD) is characterized by irreversible airflow obstruction and is strongly influenced by genetic factors [1,2]. Genome-wide association studies (GWAS) of COPD and related traits (e.g., emphysema) have revealed multiple genetic loci associated with disease risk [3][4][5]. Most loci identified by GWAS are regulatory and do not directly alter the amino acid sequence.
Gene expression is arguably the most impactful and well-studied effect of regulatory genetic variation. GWAS loci are enriched for expression quantitative trait loci (eQTL), making gene expression a potential link between genetic variants and the biology of disease [6,7]. Large cohort studies and consortia such as the Genotype-Tissue Expression Project have discovered thousands of genetic variants associated with gene expression in multiple tissues. While most GWAS do not concomitantly measure gene expression, the strong relationship between genetic variation and gene expression allows one to use gene expression reference datasets to predict gene expression from a set of genotypes, and subsequently to identify gene expression differences for a given phenotype. This approach has been implemented in software called S-PrediXcan and TWAS [8][9][10]. Aggregating variant-level information to infer gene-level associations increases the power to discover genes at loci not previously implicated by GWAS and gives mechanistic insight into which genes are regulated by disease-associated genetic variants [7,11].
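The imputation idea described above can be illustrated with a minimal sketch. It assumes a linear prediction model with per-SNP weights, and uses the summary-statistic form in which the gene-level Z score is a weighted sum of GWAS Z scores scaled by SNP and predicted-expression standard deviations estimated from a reference panel. The function names and toy inputs are hypothetical and are not taken from the S-PrediXcan software itself.

```python
import math

def predict_expression(weights, dosages):
    """Genetically predicted expression for one individual: sum of w_l * dosage_l."""
    return sum(w * d for w, d in zip(weights, dosages))

def gene_level_z(weights, snp_cov, gwas_z):
    """Gene-level Z score from GWAS summary statistics (illustrative sketch).

    weights : prediction weights w_l for the SNPs in the gene's model
    snp_cov : covariance matrix of SNP dosages in a reference panel (list of lists)
    gwas_z  : GWAS Z scores (beta / standard error) for the same SNPs
    """
    n = len(weights)
    # Variance of predicted expression: w' * Gamma * w
    var_g = sum(weights[i] * weights[j] * snp_cov[i][j]
                for i in range(n) for j in range(n))
    sigma_g = math.sqrt(var_g)
    # Weighted sum of SNP Z scores, each scaled by sigma_l / sigma_g
    return sum(weights[l] * math.sqrt(snp_cov[l][l]) * gwas_z[l]
               for l in range(n)) / sigma_g

# Toy example with made-up numbers: two SNPs in modest LD
w = [0.30, -0.15]
cov = [[0.42, 0.10], [0.10, 0.35]]
z = [4.1, -2.0]
print(round(gene_level_z(w, cov, z), 2))
```

With a single-SNP model the gene-level Z reduces to the SNP's GWAS Z score, signed by the weight, which is a convenient sanity check on the scaling.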
Despite the convention of naming a discovered locus for the nearest gene (e.g., HHIP), further study is needed to identify the specific gene(s) and variant(s) responsible for disease risk [9,11,12]. In identified COPD susceptibility loci, most loci contain multiple genes, and variants in these genes are correlated (in linkage disequilibrium). More than one gene in a locus may also play a role in disease pathogenesis, as seen in other complex diseases [13,14]. With recently developed methods and a growing amount of gene expression data made publicly available, integrating GWAS with known functional annotations of each variant (e.g., associated with gene expression) could highlight novel and biologically relevant genes for further evaluation.
We hypothesized that application of these integrative methods to specific phenotypes of COPD (severe disease and quantitative emphysema) would facilitate discovery of new gene-disease associations and elucidate the role of genes in existing susceptibility loci. Specifically, we sought to identify genes and pathways genetically up- or down-regulated by phenotype-associated variants in tissue-specific reference datasets using S-PrediXcan and TWAS [3,5], and to assess the potential causality of individual genes using multi-variant colocalization.

Genome-wide association studies and meta-analysis
We used genome-wide association summary statistics for two phenotypes based on the same four cohorts. Demographic characteristics of individuals included in analyses of these two phenotypes are summarized in Tables 1 and 2. The four cohorts included individuals enrolled in Genetic Epidemiology of COPD (COPDGene, NCT00608764), Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints (ECLIPSE, SCO104960, NCT00292552), National Emphysema Treatment Trial (NETT) and Normative Aging Study (NAS), and GenKOLS (Genetics of COPD, Norway). Meta-analyses of these two phenotypes were published previously [3,5]. Severe COPD was defined by postspirometric measures of forced expiratory volume in the first second (FEV1) lower than 50% of the predicted value and a ratio of FEV1 to forced vital capacity (FEV1/FVC) less than 0.7, excluding individuals with known severe alpha-1 antitrypsin deficiency. For quantitative emphysema, we generated the density histogram of segmented chest CT images and quantified emphysema using the percentage of low attenuation area at a threshold of −950 Hounsfield units (HU) (%LAA-950) and the HU value at the 15th percentile of the density histogram (Perc15). A summary of our approach is shown in Fig. 1.

Integration of GWAS and gene expression
To integrate our GWAS and gene expression results, we used S-PrediXcan [10]. We included two relevant reference transcriptome databases in our analysis: whole blood from Depression Genes and Networks (DGN-Blood) and lung tissue from the Genotype-Tissue Expression consortium (GTEx-Lung). Details on the prediction models and datasets used are provided in Additional file 1: Supplementary Methods. The ability of genetic variants to predict the expression of individual genes varies; only genes with significant prediction models were included in the analysis (11,529 genes for DGN-Blood and 6425 genes for GTEx-Lung). We accounted for multiple hypothesis testing using Bonferroni correction to determine statistical significance of gene-disease associations, resulting in P-value thresholds of 4.34 × 10⁻⁶ and 7.78 × 10⁻⁶ for DGN-Blood and GTEx-Lung, respectively.
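As a quick check of the arithmetic, the two significance thresholds follow from dividing a family-wise error rate by the number of genes tested in each reference panel. The sketch below assumes an alpha of 0.05, which is not stated explicitly in the text but reproduces both reported values; the function name is illustrative.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test P-value threshold under Bonferroni correction (alpha assumed to be 0.05)."""
    return alpha / n_tests

# Number of genes with significant prediction models, as reported in the text
panels = {"DGN-Blood": 11529, "GTEx-Lung": 6425}

for name, n_genes in panels.items():
    print(f"{name}: {bonferroni_threshold(n_genes):.2e}")
# DGN-Blood: 4.34e-06
# GTEx-Lung: 7.78e-06
```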

Validation in other reference transcriptome databases
To determine whether our imputed gene expression was consistent in other datasets, we tested significant genes from DGN-Blood and GTEx-Lung in two independent reference transcriptome databases, GTEx for whole blood (GTEx-Blood) and the Lung-eQTL Consortium for lung tissue, using S-PrediXcan and TWAS/FUSION (Additional file 1: Supplementary Methods). We considered an expression result to be validated if the direction of effect was consistent and the Bonferroni-corrected P value was < 0.05.
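The validation rule can be written as a small predicate. This is a sketch under stated assumptions: the correction is taken to be over the number of genes carried forward to validation (`n_tested`, a hypothetical parameter, since the exact denominator is given only in the Supplementary Methods), and the example values are made up.

```python
def is_validated(z_discovery, z_validation, p_validation, n_tested, alpha=0.05):
    """Return True if the validation result has the same sign as discovery
    and its Bonferroni-corrected P value is below alpha.

    n_tested is the number of genes tested in the validation dataset
    (an assumption of this sketch, not a value taken from the paper).
    """
    same_direction = (z_discovery > 0) == (z_validation > 0)
    p_corrected = min(1.0, p_validation * n_tested)
    return same_direction and p_corrected < alpha

# Hypothetical examples
print(is_validated(5.2, 4.6, 4.3e-6, n_tested=7))    # consistent and significant
print(is_validated(5.2, -4.5, 6.3e-6, n_tested=7))   # significant but opposite direction
```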

Colocalization analysis using eCAVIAR
Colocalization analysis estimates a posterior probability that a given variant or set of variants is causal for both the phenotype of interest (e.g., COPD) and the expression level of a given gene. We used eCAVIAR (eQTL and GWAS Causal Variant Identification in Associated Regions), as it allows for multiple causal variants [15]. Details on the parameters and procedures used in the analysis are presented in Additional file 1: Supplementary Methods. Genes identified in whole blood were tested for colocalization using eQTL from GTEx-Blood, and genes identified in lung tissue were tested using eQTL from GTEx-Lung and the Lung-eQTL Consortium. The probability that a variant is causal in both datasets for a given gene was quantified by the colocalization posterior probability (CLPP), which combines the posterior probability that the variant is causal in the GWAS with the posterior probability that it is causal in the eQTL study [15]. We also obtained functional annotations of colocalized variants in lung-relevant cell types (Additional file 1: Supplementary Methods).
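A minimal sketch of how a per-variant CLPP can be formed is given here. It assumes the two posterior probabilities have already been estimated by a fine-mapping step and, as in the eCAVIAR formulation, treats them as independent so that the CLPP is their product. The variant identifiers and probabilities are invented for illustration, and this is not the eCAVIAR implementation.

```python
def clpp(post_gwas, post_eqtl):
    """Per-variant colocalization posterior probability.

    post_gwas, post_eqtl : dicts mapping variant ID to the posterior probability
    that the variant is causal in the GWAS and in the eQTL study, respectively.
    Only variants present in both studies are scored.
    """
    return {v: post_gwas[v] * post_eqtl[v] for v in post_gwas if v in post_eqtl}

# Hypothetical posteriors for three variants at one locus
gwas = {"rsA": 0.60, "rsB": 0.30, "rsC": 0.05}
eqtl = {"rsA": 0.50, "rsB": 0.02, "rsC": 0.40}

for variant, prob in sorted(clpp(gwas, eqtl).items(), key=lambda kv: -kv[1]):
    print(variant, round(prob, 3))
```

A variant scores highly only when both studies independently point to it, which is why a strong GWAS signal in LD with a different eQTL variant yields a low CLPP.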

Severe COPD
We first examined the association between severe COPD and imputed gene expression. Significant associations based on gene-based Bonferroni corrections for DGN-Blood and GTEx-Lung are shown in Table 3 and Fig. 2.

Emphysema
In whole blood and lung tissue, we identified five genes significantly associated with %LAA-950 and one gene with Perc15 (Table 3; Fig. 2). We found two significant associations of genes at loci previously associated with %LAA-950: PSMA4 in 15q25 and ATF6B in 6p21, the latter of which is located near AGER. The top genome-wide significant variant at this latter locus, which lies within the HLA (Human Leukocyte Antigen) region, is a nonsynonymous variant in AGER; however, AGER was not significant in either blood or lung (P = 0.81 and 0.18, respectively). LILRA3, DCBLD1, and ITGA1 are at loci not previously associated with COPD or emphysema.

Validation in other reference transcriptome databases
To provide further evidence for differentially expressed genes associated with severe COPD and emphysema, we repeated our analysis using additional reference transcriptome databases with the same GWAS data. In blood, using GTEx-Blood to validate genes identified through the whole blood transcriptome analysis, we validated PSMA4, EGLN2, and RAB4B for severe COPD (P = 3.79 × 10⁻¹⁴, 1.34 × 10⁻⁵, and 1.33 × 10⁻⁴, respectively), and PSMA4 and LILRA3 for %LAA-950 (P = 3.37 × 10⁻⁷ and 3.62 × 10⁻⁵, respectively) (Table 3). For genes identified from GTEx-Lung, we validated WNT3 for severe COPD (P = 4.27 × 10⁻⁶) and DCBLD1 for %LAA-950 (P = 1.41 × 10⁻⁴) using a lung transcriptome database from the Lung-eQTL Consortium (Table 3). We also noted that for several genes a prediction model was not available, likely due to lower power and sample size in the validation dataset for whole blood [9]. Although the association of FAM13A was initially identified using the blood dataset, its association was significant using the Lung-eQTL Consortium (Z score = 4.52, P = 6.3 × 10⁻⁶).
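For readers relating the reported Z scores to P values, the conversion can be sketched as a two-sided test against the standard normal distribution. This assumes the reported P values are two-sided; the function name is illustrative. For the FAM13A result quoted above (Z score = 4.52), it gives a value of roughly 6 × 10⁻⁶, in line with the reported P once rounding of the Z score is taken into account.

```python
import math

def two_sided_p(z):
    """Two-sided P value for a standard normal Z score: 2 * (1 - Phi(|z|))."""
    return math.erfc(abs(z) / math.sqrt(2))

print(f"{two_sided_p(4.52):.2e}")
```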

Colocalization analysis of validated genes
Gene expression differences identified using S-PrediXcan may be causally associated with the phenotype of interest, but can also be due to linkage disequilibrium (LD) [15]. To determine whether there was evidence of shared causality, we performed colocalization analysis using a method that allows for multiple causal variants. Of the seven validated associations, six had at least one variant shared between the GWAS and eQTL signals (Table 4): PSMA4, EGLN2, and WNT3 (Fig. 3) for severe COPD; and PSMA4, LILRA3 (Additional file 1: Figure S1), and DCBLD1 (Additional file 1: Figure S2) for %LAA-950. For associations identified in lung, we additionally confirmed the colocalization signals using the Lung-eQTL Consortium dataset (Additional file 1: Table S1). We then sought to leverage functional annotation of shared variants, especially those with high colocalization probability. Some colocalized variants associated with PSMA4, LILRA3, DCBLD1, and WNT3 were located in annotated regulatory regions (e.g., rs35061187 is in an active transcription start site (TSS) in lung fibroblasts) or were predicted to affect transcription factor binding (Additional file 1: Tables S1 and S2).

Genetically regulated differential expression of genes in known susceptibility loci
Of the above significantly differentially regulated genes, four are in known susceptibility loci (4q22 and 15q25 with severe COPD, and 6p21 and 15q25 with %LAA-950). We also sought to investigate whether additional known susceptibility loci for severe COPD and quantitative emphysema affect the genetically regulated expression of nearby genes. We examined nominal association results (P < 0.05) at nine other susceptibility loci in either the discovery or validation datasets. Using this criterion, we found five additional suggestive associations, namely TGFB2 (1q41), HHIP (4q31), and RIN3 (14q32.12) with severe COPD, and HHIP (4q31) with %LAA-950 and Perc15 (Additional file 1: Table S3). However, we did not find any suggestive signals in 11q22 (MMP12) with severe COPD, 14q32.13 (SERPINA10) with %LAA-950, or 8p22 (DLC1) with %LAA-950 and Perc15.

Pathway enrichment analysis
In contrast to genetic gene set enrichment methods that rely only on the location of the SNP to infer affected genes [17], we used the results of our predicted gene expression to identify pathways, using the top 1% of differentially expressed genes (Table 5, Additional file 1: Supplementary Methods). We identified enrichment of the T cell receptor signaling pathway (corrected P = 6.6 × 10⁻³); this pathway included PSMA4 along with genes in the HLA complex. We also found significant enrichment for proteasome core complex genes (corrected P = 2.82 × 10⁻²), which included PSMF1, PSMB4, and PSMB9. An additional pathway of interest was cell-matrix adhesion of collagen binding (corrected P = 2.74 × 10⁻³) (Table 5). We also found enrichment of the asthma pathway using the KEGG database (corrected P = 4.80 × 10⁻³), containing MS4A2 and genes in the HLA region.
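The selection-then-enrichment logic can be sketched as follows. This is an illustration only: it takes the top 1% of genes by association P value and applies a one-sided hypergeometric over-representation test, which is a common choice for this kind of analysis but is not necessarily the exact statistic or multiple-testing correction used by the pathway tool in the study. All function names and toy inputs are hypothetical.

```python
from math import comb

def top_fraction(gene_pvalues, fraction=0.01):
    """Return the set of genes with the smallest P values (top `fraction` of all genes)."""
    ranked = sorted(gene_pvalues, key=gene_pvalues.get)
    n_top = max(1, int(len(ranked) * fraction))
    return set(ranked[:n_top])

def hypergeom_enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn without replacement
    from n_background genes, of which n_pathway belong to the pathway."""
    total = comb(n_background, n_selected)
    upper = min(n_pathway, n_selected)
    tail = sum(comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
               for k in range(n_overlap, upper + 1))
    return tail / total

# Toy example: 200 genes with made-up P values, one hypothetical pathway
pvals = {f"g{i}": i / 1000 for i in range(1, 201)}
selected = top_fraction(pvals)            # the two most significant genes
pathway = {"g1", "g2", "g50", "g120"}
overlap = len(selected & pathway)
print(hypergeom_enrichment_p(len(pvals), len(pathway), len(selected), overlap))
```

Because this test looks only at membership in the top set, it carries no information about whether pathway genes are up- or down-regulated.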

Discussion
Genome-wide association studies have arguably become the mainstay of identifying genetic risk factors for complex disease. However, these studies cannot identify which gene(s) in a region is responsible for the association, and testing all variants individually and independently is likely suboptimal. Here, we used an integrative method that combines the genetic component of gene expression with genetic association analysis in severe COPD and quantitative emphysema to predict differentially expressed genes. Importantly, this method focuses on the association of the genetic component of gene expression, not gene expression as a whole, as is typical in most gene expression studies. We also provided additional support for our results by examining a second gene expression dataset and by performing colocalization analysis, which attempts to identify whether association signals for gene expression and a phenotype of interest appear to be driven by the same causal variant(s). We implicated genes that are genetically regulated in known COPD-susceptibility loci, such as FAM13A, and also found genes in regions that were not previously reported: WNT3 for severe COPD, and DCBLD1 and LILRA3 for quantitative emphysema.
We found a novel association of WNT3 in lung tissue with severe COPD in two gene expression datasets. Although variants surrounding this gene in the 17q21 locus were not genome-wide significant in our COPD GWAS (Fig. 3), the top signal (rs9912530) is in strong LD with variants previously reported in GWAS of FEV1 [18,19], interstitial lung disease [20], and idiopathic pulmonary fibrosis [21] (r² with these previously described variants, 0.55-0.72). WNT3 (Wnt family member 3) encodes Wnt3, a critical component of the Wnt-beta-catenin-TCF signaling pathway [22] and a required signal for the apical ectodermal ridge in limb patterning [23]. Deficient WNT3 is associated with tetra-amelia syndrome, a Mendelian disease characterized by an absence of all limbs. The top signal is also in strong LD with variants associated with various complex diseases such as Parkinson's disease and celiac disease (r² 0.72-0.79) [24,25]. Previous expression studies of small airway epithelium found that this gene, along with its Wnt signaling companions, was down-regulated in smokers compared with nonsmokers [26]. Of interest, FAM13A, a well-supported COPD susceptibility gene, has been implicated in the beta-catenin/Wnt signaling pathway through protein degradation [27]. While there is substantial interest in Wnt signaling in lung disease [28], the contribution of WNT3 to the pathogenesis of COPD requires further investigation. To address whether these findings were specific for severe COPD, we repeated the analysis including moderate disease (GOLD 2). All of our genes were at least nominally significant, though overall the significance of our findings was attenuated (Additional file 1: Table S4).
For emphysema, we identified novel associations of LILRA3 and DCBLD1 using whole blood and lung tissue, respectively, and validated these findings in additional gene expression datasets. LILRA3 (leukocyte immunoglobulin like receptor A3), located in the 19q13 locus, encodes a soluble receptor for class I major histocompatibility complex (MHC) antigens that is expressed in monocytes and B cells. Our top GWAS hit in this locus was not genome-wide significant (rs384116 with P = 1.88 × 10⁻⁵; Additional file 1: Figure S1) and is 13 Mb away from the previously reported locus [16] that contains EGLN2 and RAB4B (rs7937; r² 0.002). It is in modest LD with variants suggestively associated with FEV1/FVC [18] (r² 0.44), and in strong LD with variants genome-wide significantly associated with HDL-C level [29] and prostate cancer [30] (r² 0.92-0.99). Blood may be the most relevant tissue for this gene, as it is preferentially expressed there [31] with a high estimate of heritability of gene expression in whole blood [32]. However, it may also have an effect in other tissues, given its broad eQTL effects identified by multi-tissue eQTL analysis [33]. This was supported by the suggestive signals of this gene using lung tissue in S-PrediXcan analysis (P = 7.71 × 10⁻⁵ in GTEx-Lung and 1.38 × 10⁻⁴ in the Lung-eQTL Consortium with the same direction of effect). Nonetheless, its functional role in COPD has not been described previously. Our other novel association identified in lung tissue, DCBLD1 (discoidin, CUB and LCCL domain containing 1), located in the 6q22 locus, is an integral component of cell membranes and binds to oligosaccharides [34]. GWAS signals in this locus are also below genome-wide significance (Additional file 1: Figure S2). Our top GWAS variant at this locus was in LD with variants associated with lung cancer [35] (r² 0.54).
In addition to novel associations, our study also provides insight into disease-associated genes in known COPD susceptibility loci. We identified six genes (FAM13A, GPRIN3, HYKK, PSMA4, EGLN2, and RAB4B) in three known COPD-susceptibility loci whose genetic component of gene expression in blood or in lung tissue is associated with severe COPD. Five of these six genes are not the most proximal to the top associated SNP, a phenomenon previously observed in other genetic association studies [36,37]. These findings underscore the complexity of genetic regulation in tissues and also identify multiple potential effector genes in the same locus. For example, in 15q25, PSMA4, and not CHRNA3 (the nearest gene to the top GWAS hit), was highlighted in S-PrediXcan and colocalization analysis. Although a role for IREB2 has been clearly demonstrated [38], our study suggested that other genes in the locus, particularly PSMA4, a gene encoding a subunit of the proteasome complex that acts in the proteolytic pathway [39], may also be of biologic importance.
At the 4q22 locus, an association for FAM13A identified using DGN-Blood was not validated in the GTEx-Blood dataset. However, a significant but directionally opposite association was identified in the Lung-eQTL Consortium dataset. To further explore this phenomenon, we examined individual SNP eQTL data from Framingham Heart Study (FHS) blood and from lung tissue in the Lung-eQTL Consortium (Additional file 1: Supplementary Methods). We confirmed that SNPs have opposite directions of effect in lung and blood (Additional file 1: Figures S3 and S4). This finding is consistent with prior reports describing significant and opposite tissue-specific effects of eQTLs [33,40,41]. The interpretation of this phenomenon is not clear, but it may be a result of pleiotropic effects of FAM13A [42,43]. Of note, a recent analysis of emphysema-related gene expression in blood and lung tissue [44] found that the directions of gene expression changes in the two tissues are often opposite; together, our findings highlight the tissue-specific genetic regulation of genes in COPD susceptibility loci. At the 19q13 locus, while both EGLN2 and RAB4B were successfully validated, only the GWAS and eQTL signals for EGLN2 colocalized. This genetic locus was associated with COPD [16] and smoking behavior [45]. Although the causal gene(s) in this region is unclear, methylation and expression studies support the role of EGLN2 in this region [46]. EGLN2 (egl-9 family hypoxia inducible factor 2) encodes an enzyme that regulates the degradation of the alpha subunit of hypoxia inducible factor (HIF) [47]. Gene and protein expression of HIF-1α is reduced in lung tissue samples from COPD patients [48].
Although ATF6B (activating transcription factor 6 beta) and ITGA1 (integrin subunit alpha 1) were not successfully validated, we cannot rule out the possibility of false negatives due to differences between the transcriptome datasets used for validation, and they are potentially interesting candidates for COPD. ATF6B was implicated in the unfolded protein response (UPR) pathway during endoplasmic reticulum (ER) stress following cigarette smoke, and may contribute to lung inflammation in patients with COPD [49], while integrins were found to be involved in COPD through the mitogen-activated protein kinase (MAPK) pathway [50,51]. This region also harbors variants associated with FEV1/FVC [52]. Decreased expression of ITGA1 was observed in the small airways of patients with low FEV1 [53].
Our analysis assesses only the genetic component of gene expression. We also investigated whether these genes were differentially expressed in COPD patients, in 464 blood samples from the COPDGene study [54] and 151 lung tissue samples [55] (Additional file 1: Supplementary Methods and Tables S5-S8). These genes were not differentially expressed, with the exception of LILRA3, which was nominally significant with %LAA-950 (P = 0.03). Given that the genetic component of gene expression was replicated, we believe that the genetic findings are robust, and speculate that these null findings could be due to nongenetic (i.e., environmental) perturbations that may occur downstream of, or as a result of, the genetic effects. In fact, in several cases measurements of mRNA or protein are actually opposite to those predicted by genetic risk. For example, SERPINA1 risk alleles result in decreased alpha-1 antitrypsin levels and increased risk for COPD, yet, on average, alpha-1 antitrypsin levels in patients with COPD are actually elevated. Similarly, genetic variants in AGER and DSP affect transcript or protein levels in the direction opposite to what is measured in disease [4,56,57]. The mechanisms underlying our genetic findings, as well as those for AGER and DSP, that result in null or opposite-direction effects require further experimental investigation.
In addition to examination of individual loci, we applied pathway enrichment analysis to nominally significant differentially expressed genes in severe COPD and quantitative emphysema in both whole blood and lung tissue. This analysis identified enrichment of the T cell receptor signaling pathway in emphysema. This finding is consistent with reports of antigen-specific T cell differentiation in the lungs of patients with severe emphysema [58]. Our analysis using gProfileR does not assess direction of effect, and the relative up- or down-regulation of specific genes in this pathway makes determination of direction difficult. To attempt to infer direction, we used Gene Set Enrichment Analysis (GSEA; [59]). In these results, the TCR signaling pathway and downstream TCR response were up-regulated, though these results were not statistically significant (Additional file 1: Table S9). Further study will be needed to determine the combined effects of COPD genetic susceptibility variants on T cell function and whether these explain some of the immune dysfunction seen in COPD [60,61]. The enrichment of genes in the proteasome core complex further suggested a role of the proteasome in COPD, as described previously. Somewhat surprisingly, we observed enrichment of the asthma pathway in KEGG using genes identified in quantitative emphysema. This finding complements the description of substantial genetic correlation of COPD and asthma [4], and the presence of quantitative emphysema (or lung hyperinflation) in asthmatic patients [62].
Our study did not identify associations of genetically regulated differential expression of genes at some previously reported GWAS loci. Moreover, some of our identified associations in our discovery dataset were not successfully validated in a second transcriptome dataset. These findings indicate some of the limitations of our approach. First, as S-PrediXcan uses cis genetic variants as predictors for gene expression, variants that have lesser or no effect on transcript abundance or act in trans would not be detected by this approach [63]. Second, although most genetic variants implicated by GWAS are likely regulatory, only a minority of genetic loci are explained by existing eQTLs [64]. This may be due to lack of data in the appropriate tissue, cell type, or biologic conditions; or the heterogeneity of gene expression studies of bulk tissue. We may overcome these issues as more gene expression datasets and newer techniques such as single-cell gene expression profiling [65] become widely available. Moreover, issues such as cell type composition, sample collection methods, disease status, and differences in analytic methods also made the overlapping analysis challenging. Third, the number of genes available for an analysis depends on the power and sample size of the expression data used in constructing a gene expression prediction model [8,9]. Given the noisy and condition-specific nature of gene expression datasets, variants with small effects on gene expression may be undetectable at the sample sizes available. Additionally, the difference in sample size among transcriptome databases decreases our power to validate or discover more genes.
However, despite technical and population differences, most cis-eQTLs appear to be consistent between studies [66]. Therefore, although the overall correlation between predicted and measured gene expression is in some cases modest, associations of the genetic component of gene expression, as inferred by imputed gene expression, have been successful in identifying disease-associated genes that complement existing methods.

Conclusions
In conclusion, we found that genetic determinants of gene expression were associated with severe COPD and quantitative emphysema phenotypes, identifying genes at known loci, and identifying novel COPD-associated genes. These findings were obtained by integrating GWAS results with gene expression data, performing colocalization analysis, and validating key results in independent gene expression datasets. These findings may provide mechanistic insights into the genetics of COPD.

Table S1. Colocalization probability and regulatory annotations for colocalized variants using corresponding GWAS and Lung-eQTL consortium datasets (see Excel file).
Table S2. Colocalization probability and regulatory annotations for colocalized variants using corresponding GWAS and GTEx eQTL (blood and lung) datasets (see Excel file).
Table S3. Additional imputed gene expression association results at previously described GWAS significant loci.
Table S4. Result of association analysis between imputed gene expression and moderate to severe COPD of reported genes from severe COPD.
Table S5. Differential expression analysis between COPD and controls using blood RNA-seq in COPDGene.
Table S6. Differential expression analysis of quantitative emphysema using blood RNA-seq in COPDGene.
Table S7. Differential expression analysis between severe COPD and controls using lung tissues.
Table S8. Differential expression analysis of %LAA-950 and Perc15 using lung tissues.
Table S9. T-cell-associated gene sets from Reactome using a ranked gene list from associations of %LAA-950 (DGN-Blood).
Table S10. Covariate adjustments for differential expression analysis.
Figure S1. Regional association plots within 50 kb of LILRA3. GWAS of %LAA-950 and blood eQTL were shown in upper panel. Chromatin states and epigenomic marks of normal human lung fibroblast were shown in lower panel (see Supplementary Methods).
Figure S2. Regional association plots within 50 kb of DCBLD1. GWAS of %LAA-950 and lung eQTL were shown in upper panel. Chromatin states and epigenomic marks of normal human lung fibroblast were shown in lower panel (see Supplementary Methods).
Figure S3. Scatter plot of effect size of significant SNPs from eQTL studies of blood and lung tissue for FAM13A.
Figure S4. Contribution of each FAM13A SNP in prediction models to overall association statistics.