CiliOPD: a ciliopathy-associated COPD endotype

The pathophysiology of chronic obstructive pulmonary disease (COPD) relies on airway remodelling and inflammation. Alterations of mucociliary clearance are a major hallmark of COPD caused by structural and functional cilia abnormalities. Using transcriptomic databases of whole lung tissues and isolated small airway epithelial cells (SAEC), we comparatively analysed cilia-associated and ciliopathy-associated gene signatures from a set of 495 genes in 7 datasets including 538 non-COPD and 508 COPD patients. This bio-informatics approach unveils yet undescribed cilia and ciliopathy genes associated with COPD including NEK6 and PROM2 that may contribute to the pathology, and suggests a COPD endotype exhibiting ciliopathy features (CiliOPD).


Introduction
Cilia dysfunction is a hallmark of chronic obstructive inflammatory lung diseases [1]. Alterations of both cilia structure and function alter airway mucociliary clearance. Epithelial remodelling is indicted in COPD pathogenesis, including distal to proximal repatterning of the small airways and altered generation of motile and primary ciliated cells [2][3][4].
Cellular processes related to cilia dysfunction such as autophagy [5] may represent therapeutic targets, although the genetic print of cilia involvement in COPD has only been observed in comparative gene expression studies providing COPD-and smoking-associated signatures [6][7][8]. In this study, we thought to investigate further cilia dysregulation in COPD. Rather than focussing on the biological samples to identify the most significant hits across the whole genetic code, we comparatively analysed cilia-associated and ciliopathy-associated gene signatures in 7 datasets including 538 non-COPD and 508 COPD patients.

RNAseq data analysis
Previously published datasets of gene expression of whole lung tissue samples and small airways bronchoscopic samples obtained from non-COPD and COPD patients, that were available online, were collected (GEO database; http://www.ncbi.nlm.nih.gov/geo; accession numbers: GSE47460, GSE57148, GSE76925, GSE103174, GSE11784, GSE37147, GSE56341). The expression of the genes of interest (human cilia-associated genes and ciliopathy-associated genes) was analysed depending on COPD status and expressed as fold-change compared to non-COPD. Associations with clinical/functional characteristics were searched. Analyses were performed separately in the lung compartment ("whole lung tissue sample", GSE47460, GSE57148, GSE76925, GSE103174) and in small airway epithelial cells (SAEC) ("small airways bronchoscopic samples", GSE11784, GSE37147, GSE56341). GSE plots contain all the genes of the genesets for each dataset. The y-axis is the log10 ratio obtained by dividing the mean of each gene's value in lung tissues or SAEC of non-COPD patients by its value in COPD patients. The Venn Diagram were designed with a tool from VIB/ UGent (http://bioin forma tics.psb.ugent .be/webto ols/ Venn).

Single cell sequencing analysis
The published dataset can be found on http://www.lungc ellat las.org. We retained cell clustering based on the original studies and considered only subjects with no respiratory disease [16].

Statistics
Comparisons of transcriptomic data between COPD and non-COPD subjects were performed within each dataset using SPSS v24. Paired t-tests were applied to the log2 transformed transcriptomic data; p < 0.05 was considered significant. False discovery rate (FDR) correction was applied.

Results
A total of 7 public datasets of transcriptomic analysis of either whole-lung tissues (238 non-COPD and 391 COPD patients) or SAEC (300 non-COPD and 117 COPD patients) were analysed to identify an alteration of cilia-and ciliopathy-associated genes expression in COPD patients when compared with non-COPD patients (Fig. 1a, b, Additional file 1: Table S1).
We identified 23 genes deregulated in COPD in at least 2 whole-lung datasets and 1 SAEC dataset (Additional file 1: Figure S3A and B), 12 were associated to ciliopathies (Additional file 1: Figure S3B and C) representing 6% of ciliopathy-associated genes. In addition, 47% of ciliopathy-associated genes (n = 88) were found deregulated in either COPD lung tissues or SAEC.

Discussion
Since a few cilia-associated genes were found enriched in GWAS, impaired ciliary function has been suggested to contribute to the pathogenesis of COPD [25]. Here, we considered the whole spectrum of cilia-associated genes and genes involved in known ciliopathies to compare their expression levels between COPD and non-COPD patients. The novelty of our approach lies in the concept that the alteration of cilia is paramount in COPD pathogenesis and that this organelle and its alterations are directly involved in COPD pathophysiology rather than simple collateral damage. We identified a dysregulation of the expression of 29% of cilia genesets in lung tissues and 16% in SAEC in COPD patients, suggesting that an alteration of cilia structure and/or function is an important feature of COPD. Previous comparative genetic studies on COPD patients identified a few genes associated to cilia. In this study, we revealed the full extent of cilia dysregulated expression and we questioned the genetic print of ciliopathies connected to COPD. Further studies will need to assess the role of dozens of candidate that may be functionally involved in the pathophysiology of COPD.
Motile cilia are located in the airways up to the respiratory bronchioles. Primary cilia are observed on nondifferentiated epithelial cells, fibroblasts, smooth muscle cells, and endothelial cells [1,4]. Thus, an alteration of cilia-associated genes may greatly impact the functions of the main pulmonary cell populations. We identified here two sets of dysregulated cilia-related genes depending on the initial tissue sampling: whole lung tissue including all lung cell populations, and SAEC restricted to the epithelial tissue. The 177 genes deregulated in COPD patients in at least 3 datasets (2 whole lung and 1 SAEC) Fig. 1 Study design. a Flow chart defining the selection of cilia/ciliopathy-associated genes analysed in the 4 whole lung tissues and 3 SAEC datasets. b Table summarizing the available clinical parameters across the 7 datasets in non-COPD (N-COPD) and COPD patients were mainly involved in biological processes associated to ciliary retrograde and anterograde transport, Hedgehog signalling, cilium beat frequency, cilium assembly and microtubule anchoring at centrosome. Since a large quantity of genes orchestrating cilia formation and function were found altered in COPD, we evidence a global cilia dysfunction at the root of the disease rather than a punctual alteration observed as a consequence of pathophysiological mechanisms.
Among the candidate genes, we identified for the first time a list of 24 genes (Table 1) commonly deregulated in the majority of datasets in COPD patients. The impacted biological processes corresponded to the aforementioned ones, suggesting that these genes may represent key actors to understand cilia dysfunction in COPD. Interestingly, 7 genes have been previously highlighted in experimental investigations in the context of COPD [6,[17][18][19][20][21][22][23][24]. Although they were often fortuitously exposed, their associations with COPD were found sufficiently significate to be mentioned. Single-nucleotide polymorphisms (SNP) predicting alteration of lung function were identified for DCDC2 and WDPCP in GWAS [20,22,23]. Transcriptomic studies unveiled an upregulation of NEK6 in COPD patients, a downregulation of BBS9 in large airway epithelial cells (LAEC) of smokers compared to non-smokers but not in SAEC, and a downregulation of WDR34 in SAEC of smokers compared to non-smokers but not in LAEC [6,19]. In vitro and in vivo approaches unveiled differential protein expression and localization for ADAM15 and GLI2 according to cell populations (epithelial cells vs non-epithelial cells or differentiated epithelial cells vs non-differentiated epithelial cells) and sub-cellular localization (cytoplasm vs nuclear) [17,18,21,24]. These findings were generally concordant to the relative gene expression levels we reported across multiple genesets. Further investigations will necessarily require analysis of every components of the molecular print of each candidate in order to recognize its interest in COPD studies including but not limited to: SNP and copy number alterations, gene expression levels, and cellular and sub-cellular protein localizations.
In addition, the expressions of 2 genes were associated with COPD status in all 4 lung databases (NEK6, coding for a serine/threonine-protein kinase involved  EC + + + MCC in cell cycle progression during M phase) and in all 3 SAEC databases (PROM2, coding for a transmembrane glycoprotein involved in intracellular trafficking), both being upregulated in COPD. Using single cell sequencing analysis, we confirmed that all commonly dysregulated genes (14 in lung datasets and 10 in SAEC datasets) were mainly expressed by multiciliated cells (motile cilia) and non-differentiated epithelial cells (primary cilia) ( Table 1). This hypothesis-generating study has limitations: the lack of clinical data available did not allow us to perform any analysis of the contribution of smoking history or associations with COPD severity; our analysis focused on cilia/ciliopathy-associated genes but hundreds more are involved in centriole regulation and could potentially participate to cilia alterations, nonetheless it would be challenging to distinguish their involvement during interphase (centrosome) and quiescence (ciliogenesis). Although experimental validation of the gene candidates that we identified will be needed, our results suggest alteration of cilia-related cellular and tissular processes that might have a role in COPD pathophysiology.
Ciliopathies refer to genetic disorders that are caused by the abnormal formation or function of cilia. Since cilia and cilia-associated genetic signature are abnormal in COPD patients, COPD could be included in the spectrum of ciliopathies and ciliopathy-associated COPD (CiliOPD) may be considered as a COPD endotype.

Additional file 1. Additional figures and tables.
Abbreviations COPD: Chronic obstructive pulmonary disease; SAEC: Small airway epithelial cells; GEO: Gene Expression Omnibus; NEK6: NIMA-related kinase 6; PROM2: Prominin-2; GWAS: Genome wide association study; SNP: Single-nucleotide polymorphism; LAEC: Large airway epithelial cells. Fig. 3 Identification of cilia-associated deregulated genes in COPD SAEC. a Dot plots showing cilia-associated genes signature quantification and significant fold change in COPD SAEC compared with non-COPD SAEC per dataset. b Venn diagram of overlapping genes between the 3 datasets of SAEC regarding the 399 cilia-associated genes. c Truncated violin-plots showing mean and IQR for the expression levels of FOXJ1 and PROM2 for the 3 SAEC datasets. *, p < 0.05; ***, p < 0.001 non-COPD vs COPD