Nasal gene expression differentiates COPD from controls and overlaps bronchial gene expression

Background: Nasal gene expression profiling is a promising method to characterize COPD non-invasively. We aimed to identify a nasal gene expression profile to distinguish COPD patients from healthy controls. We investigated whether this COPD-associated gene expression profile in nasal epithelium is comparable with the profile observed in bronchial epithelium.
Methods: Genome-wide gene expression analysis was performed on nasal epithelial brushes of 31 severe COPD patients and 22 controls, all current smokers, using Affymetrix Human Gene 1.0 ST Arrays. We repeated the gene expression analysis on bronchial epithelial brushes in 2 independent cohorts of mild-to-moderate COPD patients and controls.
Results: In nasal epithelium, 135 genes were significantly differentially expressed between severe COPD patients and controls, 21 being up- and 114 downregulated in COPD (false discovery rate < 0.01). Gene Set Enrichment Analysis (GSEA) showed significant concordant enrichment of COPD-associated nasal and bronchial gene expression in both independent cohorts (GSEA FDR < 0.001).
Conclusion: We identified a nasal gene expression profile that differentiates severe COPD patients from controls. Of interest, part of the nasal gene expression changes in COPD mimics differentially expressed genes in the bronchus. These findings indicate that nasal gene expression profiling is potentially useful as a non-invasive biomarker in COPD.
Trial registration: ClinicalTrials.gov registration number NCT01351792 (registration date May 10, 2011), ClinicalTrials.gov registration number NCT00848406 (registration date February 19, 2009), ClinicalTrials.gov registration number NCT00807469 (registration date December 11, 2008).
Electronic supplementary material: The online version of this article (10.1186/s12931-017-0696-5) contains supplementary material, which is available to authorized users.

For the bronchial brushes (cohort 2), differentially expressed genes between COPD patients and controls were identified with the same linear regression model as used for the nasal gene expression analysis.
For both comparator cohorts, bronchial airway gene expression was ranked according to the strength and direction of the association with COPD, and compared to the set of genes significantly altered in the nasal epithelium of individuals with COPD compared to those without COPD at FDR < 0.01.
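To illustrate how a ranked bronchial gene list can be compared with a nasal gene set, the following is a minimal sketch of a GSEA-style weighted running-sum enrichment score. It is a simplified illustration and not the GSEA software used in the study: the gene names, the statistics and the `enrichment_score` helper are hypothetical, and significance testing by permutation is omitted.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """Running-sum enrichment score for a gene set in a ranked list.

    ranked   : list of (gene, statistic) pairs, sorted from the most
               positive to the most negative association statistic.
    gene_set : set of gene names to test (e.g. a nasal signature).
    weight   : exponent applied to |statistic| for hits (assumed 1.0 here).
    """
    hit_weights = [abs(s) ** weight if g in gene_set else 0.0 for g, s in ranked]
    hit_total = sum(hit_weights)
    n_miss = sum(1 for g, _ in ranked if g not in gene_set)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running, best, curve = 0.0, 0.0, []
    for (gene, _), w in zip(ranked, hit_weights):
        if gene in gene_set:
            running += w / hit_total      # step up on a gene-set member
        else:
            running -= 1.0 / n_miss       # step down on any other gene
        curve.append(running)
        if abs(running) > abs(best):      # keep the largest deviation from zero
            best = running
    return best, curve


# Hypothetical bronchial association statistics (e.g. t-statistics for COPD).
stats = {"A": 3.0, "B": 2.0, "C": 0.5, "D": -1.0, "E": -2.5}
ranked = sorted(stats.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical set of genes upregulated in the nasal epithelium in COPD.
es, curve = enrichment_score(ranked, {"A", "B"})
print(round(es, 3))
```

A positive score means the gene set is concentrated among the genes most positively associated with COPD in the ranked list, and a negative score means it is concentrated at the opposite end.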

Nasal epithelial brushing
Nasal epithelial brushes in COPD patients were performed at the first study visit with subjects using their daily medication without having received study treatment. First, subjects were asked to blow their nose in order to remove mucus. To numb the nasal mucous membrane, 1 ml lidocaine 1% was sprayed into the right nostril. The lateral area underneath the inferior turbinate was brushed for 3 seconds. Next, the brush was placed in an Eppendorf tube containing RNA-protect fluid (Qiagen, Hilden, Germany) and stored at -80 °C until processing.

RNA isolation
Total RNA was isolated from nasal brushes using the miRNeasy kit (Qiagen, Hilden, Germany) according to the protocol of the manufacturer. In short, QIAzol Lysis Reagent was added to the samples to induce lysis. Then, chloroform was added and, after centrifuging the sample, the aqueous phase was mixed with 100% ethanol and transferred to an RNeasy® Mini column.
The sample was washed once with RWT buffer and twice with RPE buffer.
Finally, after adding RNase-free water, the sample was centrifuged and the RNA fractions were eluted from the column. Quantity and purity of the RNA were assessed with a NanoDrop 1000 UV-Vis spectrophotometer (Thermo Scientific, Breda, Netherlands) and RNA integrity of the fractions was assessed with an Agilent 2100 BioAnalyzer.

RNA processing and microarray hybridization
To minimize technical variation due to batch effects, we evenly distributed nasal epithelial samples from COPD patients and healthy controls across the different batches, taking into account gender, age and smoking status. Affymetrix, Santa Clara, CA): the samples were first stained with streptavidin-phycoerythrin (SAPE), followed by a biotinylated goat anti-streptavidin antibody to amplify the signal, and finally by a second SAPE staining. Immediately after staining, the microarrays were scanned using an Affymetrix GeneArray Scanner 3000 7G Plus (Affymetrix, Santa Clara, CA). The quality of the microarray hybridization was assessed by means of Normalized Unscaled Standard Error (NUSE) plots and Relative Log Expression (RLE) plots as previously described [1]. To account for technical variation remaining in the microarray data after quality control, a principal component (PC) analysis was performed on the normalized microarray data of the COPD patients and controls (nasal epithelium) together with the microarray data of cohort 2 (bronchial epithelium). To prevent the PC analysis from filtering out important clinical variables, we first adjusted all data for known confounders (age, sex, smoking status, RNA integrity number (RIN)), disease-specific variables (FEV1 % predicted) and tissue type (i.e. nasal or bronchial). Next, we performed a PC analysis on the residuals of this adjustment to identify PCs accounting for technical variation.
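As an illustration of the RLE quality metric mentioned above, the sketch below computes relative log expression values by subtracting each probe set's median across arrays and then summarizes every array by its median and interquartile range. The array names, expression values and the `rle_summary` helper are hypothetical, and NUSE is not covered because it requires the standard errors of a probe-level model.

```python
from statistics import median


def rle_summary(log_expr):
    """Summarize relative log expression (RLE) per array.

    log_expr : dict mapping array name -> list of log2 expression values,
               one per probe set, in the same probe-set order for all arrays
               (at least two probe sets are required).
    Returns a dict mapping array name -> (median RLE, interquartile range).
    """
    arrays = list(log_expr)
    n_probes = len(log_expr[arrays[0]])
    # Reference value per probe set: the median across all arrays.
    probe_medians = [median(log_expr[a][i] for a in arrays) for i in range(n_probes)]

    summary = {}
    for a in arrays:
        rle = sorted(v - m for v, m in zip(log_expr[a], probe_medians))
        lower = rle[: len(rle) // 2]           # lower half of the sorted values
        upper = rle[(len(rle) + 1) // 2:]      # upper half of the sorted values
        summary[a] = (median(rle), median(upper) - median(lower))
    return summary


# Hypothetical log2 expression for three arrays and four probe sets;
# "a3" is shifted upwards to mimic a poor-quality hybridization.
log_expr = {
    "a1": [8.0, 6.0, 10.0, 7.0],
    "a2": [8.1, 6.1, 9.9, 7.0],
    "a3": [9.5, 7.4, 11.6, 8.5],
}
summary = rle_summary(log_expr)
for name, (med, iqr) in summary.items():
    print(name, round(med, 2), round(iqr, 2))
```

Arrays whose RLE median lies far from zero, or whose spread is much larger than that of the other arrays, would be candidates for exclusion; the actual cut-offs used in the study are not stated here.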

Quality control, data normalization and principal component analysis
The leading principal components that together explained at least 50% of the variance in the data were included as covariates in further analyses. This approach resulted in the inclusion of the first 4 principal components, which together explained 52% of the variance.
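The selection rule can be sketched as a cumulative sum over the per-component explained variance. The individual variance fractions below are hypothetical; only the outcome (4 components explaining 52% in total) is taken from the text, and `n_components_for_variance` is an illustrative helper name.

```python
def n_components_for_variance(explained_ratio, threshold=0.5):
    """Return (k, cumulative) for the smallest k leading components whose
    summed explained-variance fractions reach the threshold."""
    cumulative = 0.0
    for k, ratio in enumerate(explained_ratio, start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return k, cumulative
    raise ValueError("threshold not reached by the supplied components")


# Hypothetical explained-variance fractions, ordered from PC1 downwards.
explained_ratio = [0.24, 0.13, 0.09, 0.06, 0.04]
k, cumulative = n_components_for_variance(explained_ratio, threshold=0.5)
print(k, round(cumulative, 2))
```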

Identification cohort: nasal epithelial brushes
Nasal samples from 36 current smokers with COPD and 23 current smokers without COPD were hybridized to microarrays. We excluded 1 microarray from a patient with COPD from the analysis due to missing smoking history information, and 5 additional microarrays (4 COPD patients and 1 control) based on quality metrics.

GO term | P-value | FDR q-value
[...] | 3.86E-6 | 7.08E-4
positive regulation of proteolysis involved in cellular protein catabolic process | 4.44E-6 | 8.05E-4
regulation of cellular amine metabolic process | 4.72E-6 | 8.46E-4
nuclear-transcribed mRNA catabolic process, nonsense-mediated decay | 5.02E-6 | 8.88E-4
microtubule organizing center organization | 5.04E-6 | 8.81E-4
translation | 5.63E-6 | 9.74E-4
plasma membrane bounded cell projection assembly | 6.12E-6 | 1.05E-3
microtubule bundle formation | 6.55E-6 | 1.11E-3
regulation of cellular protein catabolic process | 7.03E-6 | 1.17E-3
regulation of transcription from RNA polymerase II promoter in response to hypoxia | 7.06E-6 | 1.17E-3
cell projection assembly | 8.21E-6 | 1.34E-3
negative regulation of transferase activity | 8.41E-6 | 1.36E-3
negative regulation of metabolic process | 8.95E-6 | 1.43E-3
nucleobase-containing compound catabolic process | 9.61E-6 | 1.52E-3
NB: GO terms in bold represent biological processes at the bottom of the hierarchical diagram created by GOrilla, i.e. the most specific pathways. Only pathways with a p-value < 1E-5 are shown.

Figure E4: Example of the leading-edge subset, consisting of those genes that contribute the most to the enrichment of pathways; in the figure the leading-edge genes are ranked before the point at which the running sum (represented by the green line) reaches its highest point.
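The leading-edge definition in the Figure E4 caption can be sketched in a few lines: given a ranked gene list and the running sum at each rank, the leading edge is the part of the gene set ranked at or before the maximum of the running sum. The gene names, running-sum values and the `leading_edge` helper are hypothetical, and the sketch assumes a positive enrichment score; for a negative score the minimum of the running sum and the genes ranked after it would be used instead.

```python
def leading_edge(ranked_genes, gene_set, running_sum):
    """Return the gene-set members ranked at or before the running-sum peak.

    ranked_genes : genes ordered from most to least positively associated.
    gene_set     : set of genes whose enrichment was tested.
    running_sum  : running-sum value after each gene in ranked_genes.
    """
    if len(ranked_genes) != len(running_sum):
        raise ValueError("ranked_genes and running_sum must have equal length")
    # Index of the first position at which the running sum is maximal.
    peak = max(range(len(running_sum)), key=running_sum.__getitem__)
    return [g for g in ranked_genes[: peak + 1] if g in gene_set]


# Hypothetical ranked list with an unweighted running sum:
# +1/3 for each of the 3 set members, -1/3 for each of the 3 other genes.
ranked_genes = ["G1", "G2", "G3", "G4", "G5", "G6"]
gene_set = {"G1", "G2", "G5"}
running_sum = [1 / 3, 2 / 3, 1 / 3, 0.0, 1 / 3, 0.0]

edge = leading_edge(ranked_genes, gene_set, running_sum)
print(edge)
```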