

No convincing association between genetic markers and respiratory symptoms: results of a GWA study




Respiratory symptoms are associated with accelerated lung function decline, and increased hospitalization and mortality rates in the general population. Although several environmental risk factors for respiratory symptoms are known, knowledge on genetic risk factors is lacking. We aim to identify genetic variants associated with respiratory symptoms by genome-wide association (GWA) analyses.


We conducted the first GWA study on cough, dyspnea and phlegm among 7,976 participants in the LifeLines I cohort and used the LifeLines II cohort (n = 5,260) and the Vlagtwedde-Vlaardingen cohort (n = 1,529) for replication.


We identified 50 SNPs that were assessed for replication. Rs16918212, located in the alpha-2-macroglobulin pseudogene 1 (A2MP1), was associated with cough in both the identification (odds ratio (OR) = 0.72, p = 5.41 × 10⁻⁵) and the meta-analyzed replication cohorts (OR = 0.83, p = 0.033). No other significant replicated associations were found.


Given that only 1 out of 50 SNPs showed significant replication (i.e. 2%), we conclude that we did not find a convincing association between genetic markers and respiratory symptoms. Since environmental exposures are important risk factors for respiratory symptoms, the next step is to perform a genome-wide interaction (GWI) study to identify genetic susceptibility loci for respiratory symptoms in interaction with known harmful environmental exposures.


The presence of respiratory symptoms, such as chronic cough, dyspnea and phlegm, is associated with lower lung function [1, 2] and with mortality due to several causes of death [3–5]. Respiratory symptoms have been regarded as important markers of accelerated lung function decline [6, 7] and development of asthma [8].

It is known that cigarette smoking [9], allergy [10, 11], air pollution [12, 13] and occupational exposures [14, 15] are risk factors for respiratory symptoms. However, not all exposed subjects develop respiratory symptoms, which suggests that a genetic component may be involved in their development. Previous candidate gene studies reported associations between respiratory symptoms and specific genetic loci [16, 17]. To date, only one genome-wide association (GWA) study has investigated genetic susceptibility to a respiratory symptom (i.e. chronic mucus hypersecretion) [18]. Genetic susceptibility to respiratory symptoms such as cough, dyspnea, and phlegm has not yet been studied using GWA methods.

In the current study, we conducted several GWA analyses, i.e. on cough, dyspnea and phlegm, in 7,976 Caucasians of Dutch descent from the large population-based LifeLines I cohort study to identify common genetic variants associated with respiratory symptoms. We used the LifeLines II cohort and the Vlagtwedde-Vlaardingen cohort to replicate our initial findings.


Identification cohort

Genotyped individuals from the first data release of the LifeLines cohort study (2006–2011, LifeLines I) with full data on all covariates were included (n = 7,976). The LifeLines cohort study is a prospective population-based cohort studying health and health-related behavior of subjects from the three Northern provinces of the Netherlands [19, 20].

Replication cohorts

We included 5,260 subjects from the second data release from the LifeLines cohort study (2006–2011, LifeLines II) and 1,529 subjects from the last survey (1989/1990) from the Vlagtwedde-Vlaardingen cohort [21, 22], a prospective general population based cohort including Caucasians of Dutch descent, to replicate our initial findings.

Ethics, consent and permissions

Participants provided written informed consent. The study was approved by the Medical Ethics Committee of the University Medical Center Groningen, Groningen, The Netherlands (ref. METc 2007/152).

Genotyping and quality control

Genome-wide genotyping was performed in the identification and replication cohorts using Illumina CytoSNP-12 arrays. The Illumina CytoSNP-12 is an oligonucleotide chip designed to have a uniform spacing of markers across all chromosomes, with the majority of the markers on this chip reflecting common SNPs: 93% of the 301,232 markers on this chip are bi-allelic SNP markers. The genotyping quality control criteria applied in the LifeLines cohort and the Vlagtwedde-Vlaardingen cohort have been described before [19, 20]: samples with call-rates of less than 95% were excluded, as were samples of non-Caucasians and first degree relatives. SNPs were excluded if they had a genotype call-rate < 95%, minor allele frequency (MAF) < 1%, or a Hardy-Weinberg equilibrium (HWE) p-value < 10⁻⁴. In the LifeLines cohort 227,981 SNPs were included and in the Vlagtwedde-Vlaardingen cohort 242,926 SNPs were included.
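The SNP-level exclusion criteria above can be illustrated with a minimal sketch. It is not the pipeline used in the study: the function names and the genotype-count inputs are hypothetical, and the HWE p-value is approximated with a 1-degree-of-freedom chi-square test, which is an assumption (dedicated genetics software may use an exact test instead).

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Approximate HWE p-value from genotype counts (chi-square, 1 df).

    Assumption: a chi-square approximation is good enough for illustration.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 df
    return math.erfc(math.sqrt(chi2 / 2))


def passes_snp_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.95, min_maf=0.01, min_hwe_p=1e-4):
    """Return True if a SNP survives the three filters quoted in the text."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1 - freq_a)
    if call_rate < min_call_rate or maf < min_maf:
        return False
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= min_hwe_p


# Hypothetical genotype counts, for illustration only
print(passes_snp_qc(640, 320, 40, n_missing=0))
print(passes_snp_qc(500, 0, 500, n_missing=0))
```

The thresholds are keyword arguments so that the same sketch could be reused with other cut-offs.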

Respiratory symptoms

Cough, dyspnea, and phlegm were defined by standardized questionnaires from the European Community Respiratory Health Survey (ECRHS) [23]. Cough was defined as at least one positive answer to the questions: “do you usually cough first thing in the morning in the winter?” or “do you usually cough during the day, or at night, in winter?”. Dyspnea was defined as a positive answer to the question: “are you troubled by shortness of breath when hurrying on level ground or walking up a slight hill or stairs at normal pace?”. Phlegm was defined as at least one positive answer to the questions: “do you usually bring up any phlegm from your chest first thing in the morning in winter?” or “do you usually bring up any phlegm from your chest during the day, or at night, in winter?”.

Statistical analysis

The data are presented as median (min-max) for continuous variables and as frequencies (percentages) for categorical variables. The GWA analyses on the presence of the respiratory symptoms cough, dyspnea, and phlegm were performed using PLINK version 1.07 [24]. We used an additive genetic model adjusted for age, sex, and current smoking. SNPs with a p-value < 10⁻⁴ in the identification analysis were taken forward for replication. Replication analysis was performed by analyzing the two replication cohorts separately using a logistic regression model in PLINK version 1.07 [24] and subsequently meta-analyzing the effect estimates from both cohorts. Significant replication was defined as a fixed-effect meta-analysis p-value < 0.05 and an effect estimate in the same direction as in the identification GWA study. SNP annotation was performed using HaploReg version 4 (Broad Institute).
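As an illustration of the replication rule described above, the following sketch pools per-cohort log odds ratios with inverse-variance (fixed-effect) weights and then applies the two criteria (p < 0.05 and same direction of effect). The inverse-variance weighting, the function names and the per-cohort numbers are assumptions made for this example; they are not taken from the study's output.

```python
import math


def fixed_effect_meta(estimates):
    """Inverse-variance fixed-effect meta-analysis.

    estimates: list of (log_odds_ratio, standard_error), one pair per cohort.
    Returns (pooled_log_or, pooled_se, two_sided_p).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled_beta = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled_beta, pooled_se, p_value


def replicates(discovery_beta, pooled_beta, pooled_p, alpha=0.05):
    """Replication rule: p below alpha and same sign as the discovery effect."""
    return pooled_p < alpha and discovery_beta * pooled_beta > 0


# Hypothetical replication estimates (OR, SE of log OR) for two cohorts
cohorts = [(math.log(0.80), 0.10), (math.log(0.90), 0.20)]
beta, se, p = fixed_effect_meta(cohorts)
print(f"pooled OR = {math.exp(beta):.2f}, p = {p:.3f}")
print("replicated:", replicates(math.log(0.72), beta, p))
```

Working on the log odds ratio scale keeps the pooled estimate and its standard error on a scale where the normal approximation is reasonable; the pooled OR is recovered by exponentiation.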


Demographic characteristics and the prevalence of respiratory symptoms in the study cohorts are summarized in Table 1. In the identification cohort LifeLines I, the median age of subjects was 47 years old, 43% were male, and 24% were current smokers. The replication cohorts were comparable with the identification cohort with respect to demographic characteristics. The prevalence of respiratory symptoms in the LifeLines cohorts and Vlagtwedde-Vlaardingen cohort varied from 10 to 22%.

Table 1 Characteristics of the subjects included in the identification (LifeLines I) and replication (LifeLines II and Vlagtwedde-Vlaardingen) cohorts

The Manhattan plots of the GWAS of cough, dyspnea and phlegm are shown in Additional file 1: Figures S1, S2 and S3 respectively. A total of 17 SNPs, 19 SNPs and 14 SNPs were identified for cough (Table 2), dyspnea (Table 3) and phlegm (Table 4) in the identification analyses in LifeLines I, respectively, and taken forward for replication in LifeLines II and Vlagtwedde-Vlaardingen. Rs16918212 (OR = 0.72, p = 5.41 × 10⁻⁵ in identification; OR = 0.83, p = 0.033 in replication), located in A2MP1, was significantly associated with cough in the replication cohorts with the same direction of effect as in the identification cohort (Table 2). The replication analyses on dyspnea and phlegm showed no significant replication (Table 3 and Table 4).

Table 2 Top SNPs (n = 17) associated with cough in the GWA study (all P < 1.0 × 10⁻⁴)
Table 3 Top SNPs (n = 19) associated with dyspnea in the GWA study (all P < 1.0 × 10⁻⁴)
Table 4 Top SNPs (n = 14) associated with phlegm in the GWA study (all P < 1.0 × 10⁻⁴)

In addition, we performed GWA analyses on chronic cough and phlegm (both defined as cough or phlegm for at least 3 months per year) and found no significant replication in these analyses either (Additional file 1: Tables S1 and S2).


To the best of our knowledge, this is the first GWA study assessing genetic variants associated with cough, dyspnea, and phlegm. In the identification cohort, we identified 17, 19 and 14 SNPs associated with cough, dyspnea and phlegm respectively at a p < 10⁻⁴ significance level. In the meta-analysis of two independent replication cohorts, one association was observed between cough and rs16918212, located on chromosome 12 in an intron of A2MP1, and no associations with dyspnea and phlegm were replicated.

The odds ratio for this SNP indicates that carriers of the A allele have a lower risk of cough than subjects with the wild type genotype. This SNP is located in an intron of A2MP1 (alpha-2-macroglobulin pseudogene 1). A2MP1 has been associated with Alzheimer’s disease [25]. Pseudogenes are genomic DNA sequences similar to normal genes but non-functional; they have lost their gene expression in the cell or their ability to code protein [26]. Some pseudogenes can be functional when they are transcribed. Increasing evidence suggests that pseudogenes may have important physiological functions [26].

A major strength of this study is that it is the first GWA study trying to identify genetic susceptibility loci for cough, dyspnea and phlegm, and that it included 2 verification samples: one using the same methodology (LifeLines II) and one using similar methodology (Vlagtwedde-Vlaardingen) as the discovery sample (LifeLines I). The respiratory symptoms that we studied were defined based on the standardized questionnaire of the ECRHS.

A GWA study has the advantage of being hypothesis-free. This means that it has the potential to find new genes underlying disease phenotypes [27]. However, GWA studies also have some disadvantages, such as the need for a large study sample, the need for replication, the inability to address causation, and the inability to investigate rare genetic variants [27].

A limitation of our study might be that we used a liberal p-value threshold (p < 10⁻⁴) to select SNPs in the identification cohort, in order to keep the risk of missing a true association between genetic markers and respiratory symptoms low. However, when we assessed these associations in the replication cohorts, the number of significant associations in the replication meta-analysis was lower than expected by chance: only 1 of the 50 SNPs analyzed for replication (2%) had a p-value < 0.05 and the same direction of effect as in the identification analysis. In addition, given that neither rs16918212 nor the A2MP1 gene has been associated with lung function impairment or respiratory diseases, we think the association is likely not a true finding. We therefore conclude that there was no convincing association between genetic markers and respiratory symptoms in this study.
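The "expected by chance" argument can be made concrete with a short calculation. The sketch below assumes that the 50 SNPs are independent and that, under the null hypothesis, a two-sided p-value below 0.05 falls in the same direction as the identification effect half of the time; both are simplifying assumptions for illustration, not statements from the study.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n_snps = 50   # SNPs taken forward for replication
alpha = 0.05  # two-sided significance threshold in the replication step

# Expected number of SNPs with p < alpha under the null, ignoring direction
expected_any_direction = n_snps * alpha

# Assumption: the direction matches by chance with probability 1/2
p_hit = alpha / 2
expected_same_direction = n_snps * p_hit

# Probability of seeing at least one such "replication" purely by chance
prob_at_least_one = 1 - binom_pmf(0, n_snps, p_hit)

print(f"expected hits (any direction):  {expected_any_direction:.2f}")
print(f"expected hits (same direction): {expected_same_direction:.2f}")
print(f"P(at least one hit by chance):  {prob_at_least_one:.2f}")
```

Under these assumptions a single replicated SNP out of 50 is unremarkable, which is consistent with the cautious interpretation given above.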

The lack of a plausible significant association between SNPs and respiratory symptoms may be explained by the fact that a respiratory symptom can be caused by different environmental exposures or can be a presentation of different underlying diseases with specific genetic or environmental origins. For example, cough can be triggered by smoking, air pollution and occupational exposures. Susceptibility to these various exposures may be genetically determined and susceptibility loci may differ between exposures. In addition, cough is a common symptom of several chronic respiratory conditions such as asthma, chronic obstructive pulmonary disease (COPD), and lung cancer [28], but cough is also present in non-respiratory conditions such as heart failure [29]. Dyspnea is a common symptom not only in patients with lung and heart diseases, but it is also fairly prevalent among elderly individuals without apparent pre-existing disease [5].


We did not find a convincing association between genetic markers and the presence of the respiratory symptoms cough, dyspnea and phlegm. This lack of association may be because we did not take into account the effect of environmental exposures that give rise to respiratory symptoms. Therefore, the next logical step will be to perform a genome-wide interaction (GWI) study to identify genetic loci for respiratory symptoms in interaction with known harmful environmental exposures.



GWAS: Genome wide association study; SNP: Single nucleotide polymorphism


1. Jakeways N, McKeever T, Lewis SA, Weiss ST, Britton J. Relationship between FEV1 reduction and respiratory symptoms in the general population. Eur Respir J. 2003;21:658–63.
2. Sunyer J, Basagana X, Roca J, Urrutia I, Jaen A, Anto JM, Burney P. Relations between respiratory symptoms and spirometric values in young adults: the European community respiratory health study. Respir Med. 2004;98:1025–33.
3. Frostad A, Soyseth V, Andersen A, Gulsvik A. Respiratory symptoms as predictors of all-cause mortality in an urban community: a 30-year follow-up. J Intern Med. 2006;259:520–9.
4. Leivseth L, Nilsen TI, Mai XM, Johnsen R, Langhammer A. Lung function and respiratory symptoms in association with mortality: The HUNT Study. COPD. 2014;11:59–80.
5. Figarska SM, Boezen HM, Vonk JM. Dyspnea severity, changes in dyspnea status and mortality in the general population: the Vlagtwedde/Vlaardingen study. Eur J Epidemiol. 2012;27:867–76.
6. Vestbo J, Prescott E, Lange P. Association of chronic mucus hypersecretion with FEV1 decline and chronic obstructive pulmonary disease morbidity. Copenhagen City Heart Study Group. Am J Respir Crit Care Med. 1996;153:1530–5.
7. Sherman CB, Xu X, Speizer FE, Ferris BJ, Weiss ST, Dockery DW. Longitudinal lung function decline in subjects with respiratory symptoms. Am Rev Respir Dis. 1992;146:855–9.
8. Sistek D, Wickens K, Amstrong R, D’Souza W, Town I, Crane J. Predictive value of respiratory symptoms and bronchial hyperresponsiveness to diagnose asthma in New Zealand. Respir Med. 2006;100:2107–11.
9. Jaakkola MS, Jaakkola JJ, Becklake MR, Ernst P. Effect of passive smoking on the development of respiratory symptoms in young adults: an 8-year longitudinal study. J Clin Epidemiol. 1996;49:581–6.
10. Elberling J, Linneberg A, Mosbech H, Dirksen A, Menne T, Nielsen NH, Madsen F, Frolund L, Johansen JD. Airborne chemicals cause respiratory symptoms in individuals with contact allergy. Contact Dermatitis. 2005;52:65–72.
11. Smit LA, Heederik D, Doekes G, Blom C, van Zweden I, Wouters IM. Exposure-response analysis of allergy and respiratory symptoms in endotoxin-exposed adults. Eur Respir J. 2008;31:1241–8.
12. Stern G, Latzin P, Roosli M, Fuchs O, Proietti E, Kuehni C, Frey U. A prospective study of the impact of air pollution on respiratory symptoms and infections in infants. Am J Respir Crit Care Med. 2013;187:1341–8.
13. Li S, Williams G, Jalaludin B, Baker P. Panel studies of air pollution on children’s lung function and respiratory symptoms: a literature review. J Asthma. 2012;49:895–910.
14. Zhang LX, Enarson DA, He GX, Li B, Chan-Yeung M. Occupational and environmental risk factors for respiratory symptoms in rural Beijing, China. Eur Respir J. 2002;20:1525–31.
15. Marchetti N, Garshick E, Kinney GL, McKenzie A, Stinson D, Lutz SM, Lynch DA, Criner GJ, Silverman EK, Crapo JD. Association between occupational exposure and lung function, respiratory symptoms, and high-resolution computed tomography imaging in COPDGene. Am J Respir Crit Care Med. 2014;190:756–62.
16. Chang AB, Gibson PG, Ardill J, McGarvey LP. Calcitonin gene-related peptide relates to cough sensitivity in children with chronic cough. Eur Respir J. 2007;30:66–72.
17. Cho HJ, Kim SH, Kim JH, Choi H, Son JK, Hur GY, Park HS. Effect of Toll-like receptor 4 gene polymorphisms on work-related respiratory symptoms and sensitization to wheat flour in bakery workers. Ann Allergy Asthma Immunol. 2011;107:57–64.
18. Dijkstra AE, Boezen HM, van den Berge M, Vonk JM, Hiemstra PS, Barr RG, Burkart KM, Manichaikul A, Pottinger TD, Silverman EK, et al. Dissecting the genetics of chronic mucus hypersecretion in smokers with and without COPD. Eur Respir J. 2015;45:60–75.
19. de Jong K, Vonk JM, Timens W, Bosse Y, Sin DD, Hao K, Kromhout H, Vermeulen R, Postma DS, Boezen HM. Genome-wide interaction study of gene-by-occupational exposure and effects on FEV1 levels. J Allergy Clin Immunol. 2015;136:1664–1672.e14.
20. Scholtens S, Smidt N, Swertz MA, Bakker SJ, Dotinga A, Vonk JM, van Dijk F, van Zon SK, Wijmenga C, Wolffenbuttel BH, et al. Cohort Profile: LifeLines, a three-generation cohort study and biobank. Int J Epidemiol. 2015;44:1172–80.
21. de Jong K, Boezen HM, Kromhout H, Vermeulen R, Postma DS, Vonk JM. Association of occupational pesticide exposure with accelerated longitudinal decline in lung function. Am J Epidemiol. 2014;179:1323–30.
22. van Diemen CC, Postma DS, Vonk JM, Bruinenberg M, Schouten JP, Boezen HM. A disintegrin and metalloprotease 33 polymorphisms and lung function decline in the general population. Am J Respir Crit Care Med. 2005;172:329–33.
23. Burney PG, Luczynska C, Chinn S, Jarvis D. The European Community Respiratory Health Survey. Eur Respir J. 1994;7:954–60.
24. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81:559–75.
25. Blacker D, Wilcox MA, Laird NM, Rodes L, Horvath SM, Go RC, Perry R, Watson BJ, Bassett SS, McInnis MG, et al. Alpha-2 macroglobulin is genetically associated with Alzheimer disease. Nat Genet. 1998;19:357–60.
26. Chan WL, Chang JG. Pseudogene-derived endogenous siRNAs and their function. Methods Mol Biol. 2014;1167:227–39.
27. Boezen HM. Genome-wide association studies: what do they teach us about asthma and chronic obstructive pulmonary disease? Proc Am Thorac Soc. 2009;6:701–3.
28. Goldsobel AB, Chipps BE. Cough in the pediatric population. J Pediatr. 2010;156:352–8.
29. Pavord ID, Chung KF. Management of chronic cough. Lancet. 2008;371:1375–84.



Not applicable.


This study was funded by the Groningen Research Institute for Drug Exploration (GUIDE), University Medical Center Groningen, University of Groningen, the Netherlands. The LifeLines Cohort Study, and generation and management of GWAS genotype data for the LifeLines Cohort Study is supported by the Netherlands Organization of Scientific Research NWO (grant 175.010.2007.006), the Economic Structure Enhancing Fund (FES) of the Dutch government, the Ministry of Economic Affairs, the Ministry of Education, Culture and Science, the Ministry for Health, Welfare and Sports, the Northern Netherlands Collaboration of Provinces (SNN), the Province of Groningen, University Medical Center Groningen, the University of Groningen, Dutch Kidney Foundation and Dutch Diabetes Research Foundation. The Vlagtwedde-Vlaardingen cohort study was supported by the Ministry of Health and Environmental Hygiene of the Netherlands and the Netherlands Asthma Fund (grant 187) and the Netherlands Asthma Fund grant no., the Stichting Astma Bestrijding, BBMRI-NL (Complementiation project), and the European Respiratory Society COPD research award 2011 to H.M. Boezen.

The sponsors had no role in the study design, data collection, analysis and interpretation, or in writing and submitting the manuscript.

Availability of data and materials

Please contact author for data requests.

Authors’ contributions

XZ and KdJ performed the statistical analyses and drafted the first version of the manuscript. HMB designed the study. HMB and JMV supervised the statistical analyses. JMV, KdJ, XX, XH and HMB interpreted the results and have written the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Ethics approval and consent to participate

Participants provided written informed consent. The study was approved by the Medical Ethics Committee of the University Medical Center Groningen, Groningen, The Netherlands (ref. METc 2007/152).

Author information

Correspondence to H. Marike Boezen.

Additional file

Additional file 1:

Manhattan plots and additional analyses on chronic cough and phlegm. (DOCX 584 kb)

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.




  • GWAS
  • Genetic
  • Respiratory symptoms
  • General population
  • cohorts