Genetic analyses in a cohort of 191 pulmonary arterial hypertension patients

Background Pulmonary arterial hypertension (PAH) is a progressive and fatal disorder associated with high pulmonary artery pressure. Genetic testing enables early diagnosis and offers an opportunity for family screening. To identify genetic mutations and support a precise diagnosis, we performed genetic testing in 191 probands with PAH and analyzed the genotype-phenotype correlation. Methods Initially, PAH samples (n = 119) underwent BMPR2 screening by Sanger sequencing. Later, we developed a PAH panel test to identify causal mutations in 13 genes related to PAH and attempted to call BMPR2 copy number variations (CNVs) from the panel data. Multiplex ligation-dependent probe amplification (MLPA) was used to search for CNVs in BMPR2, ACVRL1 and ENG. Notably, the EIF2AK4 gene was also included in the panel, which allowed pulmonary veno-occlusive disease (PVOD)/pulmonary capillary hemangiomatosis (PCH) patients to be distinguished from those with idiopathic PAH (IPAH). Patient characteristics were compared using the t test for continuous variables. Results By sequencing, pathogenic BMPR2 mutations were the most frequent finding, detected in 32 (17.9%) IPAH and 5 (41.7%) heritable PAH (HPAH) patients, and all 12 BMPR2 CNVs called from the panel data were confirmed by MLPA analysis. In addition, homozygous or compound heterozygous EIF2AK4 mutations were identified in 6 patients, whose diagnosis should be corrected to PVOD/PCH. Genotype-phenotype correlation analysis revealed that PAH patients with BMPR2 mutations were younger at diagnosis (27.2y vs. 31.6y, p = 0.0003) and exhibited more severe pulmonary hemodynamic impairment and a worse cardiac index than those without BMPR2 mutations. Conclusions The panel assay is a highly valuable tool in PAH genetic testing, not only for detecting small sequence alterations but also for indicating BMPR2 CNVs, which identifies the specific samples that warrant a further MLPA assay. Analysis of PAH causal genes greatly aids clinical diagnosis and has important implications for disease treatment.
Electronic supplementary material The online version of this article (10.1186/s12931-018-0789-9) contains supplementary material, which is available to authorized users.


Background
Pulmonary arterial hypertension (PAH) is a progressive and fatal disorder, which is diagnosed by a mean pulmonary artery pressure ≥ 25 mmHg at rest accompanied by a pulmonary artery wedge pressure (PAWP) ≤ 15 mmHg and a pulmonary vascular resistance (PVR) > 3 Wood units (WU), excluding other known causes of pulmonary hypertension, e.g., pulmonary diseases, thromboembolic diseases and left heart diseases [1]. Given the nonspecific clinical manifestations in the early stage, patients with PAH are frequently diagnosed late. Once the pulmonary pathologic changes have become advanced and irreversible, the prognosis is poor. It has been reported that more than 20% of patients have a two-year delay in the diagnosis of PAH [2]; if not treated in time, patients may progress to right heart failure [3], and the average survival after diagnosis is 2.8 years [4]. Idiopathic PAH (IPAH) corresponds to sporadic disease without any family history of PAH or known cause, and has an estimated prevalence of 5.9 per 1,000,000. Heritable PAH (HPAH) is diagnosed when there is a positive family history or when a pathogenic variant has been identified. For individuals with PAH, the most effective therapy to date is continuous injection of prostacyclin analogs. Other options include calcium channel blockers, oral prostanoids, phosphodiesterase type 5 inhibitors, and endothelin receptor antagonists [5]. Lung transplantation is effective, but long-term post-surgical survival is limited [6].
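The hemodynamic definition above can be written as a simple check. The following is a minimal sketch: the function and parameter names are illustrative, and it encodes only the three thresholds stated in the text, not the clinical exclusion of other causes of pulmonary hypertension.

```python
def meets_pah_hemodynamic_criteria(mpap_mmhg: float,
                                   pawp_mmhg: float,
                                   pvr_wu: float) -> bool:
    """Check resting hemodynamics against the three thresholds in the text.

    mpap_mmhg: mean pulmonary artery pressure (threshold: >= 25 mmHg)
    pawp_mmhg: pulmonary artery wedge pressure (threshold: <= 15 mmHg)
    pvr_wu:    pulmonary vascular resistance (threshold: > 3 Wood units)
    """
    return mpap_mmhg >= 25 and pawp_mmhg <= 15 and pvr_wu > 3


# Hypothetical values, for illustration only.
print(meets_pah_hemodynamic_criteria(48, 10, 9.5))  # True
print(meets_pah_hemodynamic_criteria(30, 18, 4.0))  # False (PAWP above 15 mmHg)
```

A positive result here would only indicate pre-capillary pulmonary hypertension by these numbers; the diagnosis of PAH still depends on excluding the other causes listed above.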
Since the 1990s, much has been learned about the molecular and genetic factors related to PAH. BMPR2 mutations have been identified as the main genetic cause of PAH, accounting for 75%-90% of familial PAH [7,8] and 3.5%-40% of sporadic cases [7,9], with an autosomal dominant inheritance pattern [10]. The BMPR2 gene encodes bone morphogenetic protein receptor type II, which belongs to the transforming growth factor-β (TGF-β) superfamily and is involved in the regulation of cell growth and apoptosis. Two other genes, ACVRL1 [11] and ENG [12], which also encode receptors belonging to the TGF-β superfamily, have more recently been identified in PAH accompanying hereditary hemorrhagic telangiectasia (HHT). Like BMPR2, these two genes affect vascular proliferation. As genetic research has advanced, more genes have been implicated in the pathogenesis of PAH, including KCNK3 [13], CAV1 [14], SMAD9 [15] and BMPR1B [16], although mutations in these genes are considerably less common (1%-3%). The discovery of a causative mutation could help make an early diagnosis and provide an opportunity for family screening [17].
Pulmonary veno-occlusive disease (PVOD) is a rare disease that shares several clinical and hemodynamic similarities with IPAH but is distinct from IPAH in pathology and prognosis. An accurate diagnosis of PVOD in the early stage is crucial for suitable drug therapy and timely consideration of lung transplantation. Histological examination of a lung biopsy is the gold standard for the diagnosis of PVOD, but it is often too invasive to be acceptable to the patient. Recent discoveries have indicated that mutations in the eukaryotic translation initiation factor 2 alpha kinase 4 gene (EIF2AK4) are responsible for inherited PVOD and pulmonary capillary hemangiomatosis (PCH) [18], and the 2015 ESC/ERS Guidelines state that identification of a biallelic EIF2AK4 mutation is sufficient to make a diagnosis of PVOD/PCH without histological confirmation [1].
To identify the causal mutations in PAH patients and differentiate PVOD patients from those with IPAH, a custom gene panel targeting PAH and related diseases was used to test 191 probands with clinically suspected IPAH and HPAH. Furthermore, the performance of calling copy number variations (CNVs) from our panel data was evaluated.

Subjects
One hundred and ninety-one PAH patients referred by the Center of Pulmonary Vascular Disease at Fuwai Hospital between 2016 and 2017 were enrolled in our study. All patients underwent a detailed clinical examination, and other known causes of pulmonary hypertension were excluded by an expert physician.

Sanger sequencing
Genomic DNA was extracted from EDTA-anticoagulated whole blood using a Wizard® Genomic Purification System A1125 (Promega, USA) kit according to the manufacturer's recommended protocol. All 13 exons in the BMPR2 coding sequence region and a minimum of 20 base pairs of intronic DNA flanking each exon were amplified by PCR (Tiangen Biotech, Beijing, China). Briefly, the PCR program consisted of initial denaturation for 10 min at 95°C; followed by 30 cycles of 30 s at 95°C, 30 s at 55°C and 30 s at 72°C; and a final extension at 72°C for 7 min. Then, the PCR products were purified with a TIANgel Midi Purification Kit (Tiangen Biotech, Beijing, China) and subsequently sequenced using a BigDye Terminator Cycle sequencing kit V 3.1 (ABI Biosystems) on an ABI PRISM 3730 Sequence Analyzer according to the manufacturer's directions.

Targeted panel sequencing
Library preparation was performed according to the manufacturer's instructions (NEBNext Ultra DNA Library Prep Kit, Illumina, USA). Briefly, 1 μg of genomic DNA was sheared to 200 bp fragments, ligated with adaptors and size-selected for PCR amplification. Amplified DNA samples were pooled in groups of four and captured using the customized gene panel with biotinylated oligos (SeqCap EZ Choice Library, Roche NimbleGen, USA). The enriched libraries were sequenced on an Illumina MiSeq 2000 sequencer (Illumina, San Diego, CA) using 2 × 150 paired-end sequencing (MiSeq Reagent Kit v2, Illumina, San Diego, CA).

Bioinformatics analysis
Low-quality reads and adaptor sequences were filtered out using Trimmomatic software so that only high-quality reads were retained. The clean reads were aligned to the human reference genome (hg19) with the Burrows-Wheeler Aligner (BWA), and PCR duplicates were marked with the Picard software. SNPs and insertions/deletions were identified using the GATK HaplotypeCaller program (http://www.broadinstitute.org/gsa/wiki/index.php/Home_Page) and annotated with ANNOVAR software for their frequencies in the Genome Aggregation Database (gnomAD) and for the pathogenicity and splice-altering predictions of single nucleotide variants in the dbNSFP database, which included results from SIFT, PolyPhen-2, MutationTaster and the dbscSNV database. In addition, the clinical significance of the variants was annotated using ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/), OMIM (http://omim.org/), UniProt (http://www.uniprot.org/) and HGMD (http://www.hgmd.org). Variants with scores greater than 0.6 in the dbscSNV database were defined as splice-altering. Other synonymous variants that did not fulfill the above-mentioned conditions were removed. Because most genes related to PAH follow autosomal-dominant inheritance, variants with a minor allele frequency (MAF) of > 0.1% in panel genes other than EIF2AK4 were excluded from further analysis as polymorphisms, whereas variants in EIF2AK4 were excluded only at a MAF of > 1% because of the gene's autosomal-recessive inheritance pattern.
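The frequency and splicing filters described above can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline actually used: the `Variant` record, its field names and the treatment of variants absent from the population database (kept as rare) are hypothetical; only the three cutoffs (0.1%, 1% and 0.6) come from the text.

```python
from dataclasses import dataclass
from typing import Optional

RECESSIVE_GENES = {"EIF2AK4"}   # autosomal-recessive gene named in the text
MAF_CUTOFF_DOMINANT = 0.001     # 0.1% for all other panel genes
MAF_CUTOFF_RECESSIVE = 0.01     # 1% for EIF2AK4
DBSCSNV_CUTOFF = 0.6            # score above which a variant counts as splice-altering


@dataclass
class Variant:                  # hypothetical record, not an ANNOVAR output format
    gene: str
    maf: Optional[float]        # population allele frequency; None if not observed
    effect: str                 # e.g. "missense", "nonsense", "synonymous"
    dbscsnv_score: Optional[float] = None


def is_splice_altering(v: Variant) -> bool:
    return v.dbscsnv_score is not None and v.dbscsnv_score > DBSCSNV_CUTOFF


def passes_filters(v: Variant) -> bool:
    cutoff = MAF_CUTOFF_RECESSIVE if v.gene in RECESSIVE_GENES else MAF_CUTOFF_DOMINANT
    # Assumption: a variant missing from the population database is treated as rare.
    if v.maf is not None and v.maf > cutoff:
        return False
    # Synonymous variants are kept only when predicted to alter splicing.
    if v.effect == "synonymous" and not is_splice_altering(v):
        return False
    return True


print(passes_filters(Variant("BMPR2", 0.002, "missense")))    # False: MAF above 0.1%
print(passes_filters(Variant("EIF2AK4", 0.002, "missense")))  # True: MAF below 1%
```

The gene-specific cutoff reflects the reasoning in the text: a recessive disease allele can persist at a higher population frequency than a dominant one, so a looser threshold is applied to EIF2AK4.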

Variant classification
Variants were analyzed for pathogenicity according to the recommendations of the American College of Medical Genetics (ACMG). Specifically, the analysis was based on the following criteria: (i) whether they were previously reported by a functional study or family segregation study; (ii) the nature of the variant (e.g., nonsense, frameshift indel, or splicing mutations (intron ±1 or ±2)); (iii) variant frequency in population databases; (iv) conservation of the altered residue; (v) in silico prediction (SIFT, PolyPhen-2, or MutationTaster); (vi) de novo mutation; and (vii) family segregation studies. Based on this information, a variant was classified into one of the following 5 categories: benign, likely benign, unknown significance, likely pathogenic or pathogenic.

Copy-number variation calling
All CNVs were called by panelcn.MOPS [25], which was designed to detect CNVs in targeted next-generation sequencing (NGS) panel data. Sequencing quality was checked, and the mean sequencing coverage was 98.7%. Duplicate reads were removed with the Picard software before CNV calling. Samples with a high correlation of read counts (RCs) were selected automatically as controls from all the samples by panelcn.MOPS. Other parameters were kept at their default values.
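The idea of choosing controls by read-count correlation can be illustrated with a small sketch. This is not the panelcn.MOPS implementation (which is an R package with its own selection and modeling logic); the function names, the use of Pearson correlation and the sample data are assumptions made for illustration only.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length read-count vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def select_controls(test_counts: Sequence[float],
                    candidates: Dict[str, Sequence[float]],
                    max_controls: int = 2) -> List[str]:
    """Rank candidate samples by correlation with the test sample; keep the best."""
    ranked = sorted(candidates,
                    key=lambda name: pearson(test_counts, candidates[name]),
                    reverse=True)
    return ranked[:max_controls]


# Hypothetical per-region read counts for one test sample and three candidates.
test_sample = [100, 200, 150, 300]
candidates = {
    "A": [110, 210, 140, 310],
    "B": [300, 100, 250, 90],
    "C": [50, 100, 75, 150],
}
print(select_controls(test_sample, candidates))  # ['C', 'A']
```

Controls whose read-count profile tracks the test sample closely share its capture and sequencing biases, so a region that departs from the shared profile is more likely to reflect a true copy number change.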

Multiplex ligation-dependent probe amplification (MLPA)
To detect the CNVs in BMPR2, the commercially available MLPA kit P093-C2 was used, containing probes for BMPR2, ENG and ACVRL1 (MRC-Holland, Amsterdam, The Netherlands). MLPA was performed according to the manufacturer's protocol (www.mlpa.com), and the generated fragments were detected using an ABI 3500XL DX capillary electrophoresis system (Applied Biosystems). The results were analyzed with the Coffalyser software (MRC-Holland).
Values are mean ± SD or n (%). NYHA, New York Heart Association functional class; RAP, right atrial pressure; mPAP, mean pulmonary artery pressure; PVR, pulmonary vascular resistance; PAWP, pulmonary artery wedge pressure; CI, cardiac index; SvO2, mixed venous oxygen saturation; Peak VO2, peak oxygen consumption; NT-proBNP, N-terminal pro-B-type natriuretic peptide

Statistical analysis
An unpaired t-test was used to compare continuous variables between genotypes, using GraphPad Prism 5. Data are presented as mean ± standard deviation (SD), and p < 0.05 was considered statistically significant. All statistical tests were two-sided.
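For readers who wish to reproduce this kind of comparison outside GraphPad Prism, the test statistic can be computed as sketched below. The sketch assumes the equal-variance (Student) form of the unpaired t-test; the text does not state whether a Welch correction was applied. The function name and the sample values are hypothetical, and the p-value lookup is left out because it requires the t distribution.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def unpaired_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for Student's unpaired t-test."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error, df


# Hypothetical measurements for two genotype groups.
group_1 = [1, 2, 3, 4, 5]
group_2 = [2, 4, 6, 8, 10]
t_stat, df = unpaired_t(group_1, group_2)
print(round(t_stat, 4), df)  # -1.8974 8
```

The two-sided p-value is then obtained from the t distribution with `df` degrees of freedom and compared with the 0.05 threshold stated above.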

Results
A total of 191 patients were enrolled in our study and underwent genetic testing. Of the patients, 179 suffered from IPAH and 12 had HPAH. The main clinical findings are summarized in Table 1.
High-quality genetic testing was completed for all samples. Initially, a total of 119 PAH samples were submitted to single-gene testing of BMPR2 by Sanger sequencing. Later, we developed a PAH panel test in our lab. The 97 samples with negative results in the BMPR2 test and 72 subsequently referred samples were then directed to the panel test involving 13 genes related to PAH. If the samples were still negative, MLPA testing was performed to search for the causative mutation (Additional file 1: Figure S1).
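The tiered testing schedule and the sample counts above can be summarized in a short sketch. The function name and the result labels are illustrative; the counts are taken from the text.

```python
from typing import Optional


def next_test(sanger_result: Optional[str], panel_result: Optional[str]) -> Optional[str]:
    """Return the next test for a sample, or None when no further tier applies.

    Each result is "positive", "negative", or None if that test was not performed.
    """
    if sanger_result == "positive" or panel_result == "positive":
        return None        # causative mutation already found
    if panel_result is None:
        return "panel"     # BMPR2-negative by Sanger, or referred directly to the panel
    return "MLPA"          # panel-negative samples continue to MLPA


# Counts reported in the text.
sanger_tested = 119
sanger_negative_sent_to_panel = 97
later_referrals_sent_to_panel = 72

print(sanger_negative_sent_to_panel + later_referrals_sent_to_panel)  # 169 panel-tested
print(sanger_tested + later_referrals_sent_to_panel)                  # 191 probands in total
```

The two sums agree with the 169 panel-tested probands and the 191 enrolled patients reported in this section.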
Sequencing of the 13 PAH genes (Additional file 1: Table S1) in the 169 probands yielded a mean depth of 350× and coverage of 98.7%. Exons with low (< 20×) or no coverage were subjected to Sanger sequencing to obtain 100% coverage in the main gene, BMPR2. In addition, potentially pathogenic mutations and rare variants of unknown significance (VUS) were confirmed using Sanger sequencing.
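Flagging exons for Sanger back-fill can be sketched as a simple depth threshold. The 20× cutoff comes from the text; the function name, the use of a per-exon mean depth and the example values are assumptions for illustration.

```python
from typing import Dict, List

MIN_DEPTH = 20  # exons below this depth are treated as low coverage


def exons_needing_sanger(exon_depths: Dict[str, float],
                         min_depth: float = MIN_DEPTH) -> List[str]:
    """Return the exons whose depth is below the threshold (including zero coverage)."""
    return [exon for exon, depth in exon_depths.items() if depth < min_depth]


# Hypothetical per-exon mean depths for one sample.
depths = {"BMPR2_exon1": 12.0, "BMPR2_exon2": 410.5, "BMPR2_exon3": 0.0, "BMPR2_exon4": 20.0}
print(exons_needing_sanger(depths))  # ['BMPR2_exon1', 'BMPR2_exon3']
```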
Causal mutations identified by sequencing (both BMPR2 sequencing and panel sequencing), except for biallelic EIF2AK4 mutations, are described in Table 2, and VUS are summarized in Additional file 1: Table S2. BMPR2 mutations were the most frequent, detected in 32 (17.9%) IPAH and 5 (41.7%) HPAH patients. Notably, no suspected mutations were found in 4 HPAH families in our cohort by either panel sequencing or MLPA (Additional file 1: Figure S2); these families require further genetic testing to identify new genes potentially related to PAH. In addition, homozygous or compound heterozygous EIF2AK4 mutations were detected in 6 patients clinically diagnosed with IPAH (Table 3); thus, their diagnosis had to be corrected to PVOD/PCH.
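The distinction between homozygous, compound heterozygous and single heterozygous EIF2AK4 findings can be sketched from VCF-style genotype strings. This is an illustrative simplification: the function name and labels are hypothetical, and the note that two heterozygous variants need phase confirmation (that they lie on different alleles) is a general genetic consideration rather than a procedure described in the text.

```python
from typing import List


def eif2ak4_status(genotypes: List[str]) -> str:
    """Summarize one patient's pathogenic EIF2AK4 variants.

    genotypes: one VCF-style GT string ("0/1", "1/1", "0|1", ...) per variant.
    """
    normalized = [gt.replace("|", "/") for gt in genotypes]
    if "1/1" in normalized:
        return "homozygous"
    n_het = sum(gt in ("0/1", "1/0") for gt in normalized)
    if n_het >= 2:
        # Two heterozygous variants are biallelic only if they are in trans.
        return "possible compound heterozygous (confirm phase)"
    if n_het == 1:
        return "heterozygous carrier"
    return "none"


print(eif2ak4_status(["1/1"]))         # homozygous
print(eif2ak4_status(["0/1", "0/1"]))  # possible compound heterozygous (confirm phase)
print(eif2ak4_status(["0/1"]))         # heterozygous carrier
```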
Because deletions/duplications contribute considerably to pathogenic BMPR2 mutations in HPAH, we attempted to detect CNVs in the targeted NGS panel data. Using the panelcn.MOPS pipeline recently developed by Povysil et al. [25], 12 BMPR2 deletions were detected after excluding those flagged as low quality (Table 4). MLPA was performed to confirm the results. All BMPR2 CNVs were confirmed, with one slight inconsistency in patient PAH52, for whom panelcn.MOPS suggested a deletion of exon 11 only whereas MLPA revealed a deletion of exons 11-12 in BMPR2. This suggests that the method implemented by panelcn.MOPS is highly effective for BMPR2 CNV detection. In contrast, the CNVs in ENG and ACVRL1 called by panelcn.MOPS proved to be false (Additional file 1: Table S3), probably because of a high GC content (Additional file 1: Figure S3).
In exploring the genotype-phenotype correlation, we found that PAH patients with BMPR2 causal and rare mutations were younger at diagnosis (27.2y vs. 31.6y, p = 0.0003) and had more severe pulmonary hemodynamic impairment and a worse cardiac index (2.6 L/min/m² vs. 3.0 L/min/m², p = 0.0017) (Table 5) than those without BMPR2 mutations, in accord with previous reports [9]. In addition, patients with biallelic EIF2AK4 mutations had a lower diffusing capacity of the lung for carbon monoxide (DLCO) than PAH patients (Additional file 1: Table S4).

Discussion
To the best of our knowledge, this is the largest reported cohort of Chinese PAH patients to undergo genetic testing, and it substantially expands the PAH genetic database for the Asian population. In the present study, causative mutations were detected in 31.3% (56/179) of IPAH and 66.7% (8/12) of HPAH patients. The targeted NGS panel assay is a highly effective clinical tool that allows not only the detection of small sequence alterations but also, with specific computational algorithms, an indication of CNVs. Our results showed that 37 BMPR2 mutations (37/49, 75.5%) were identified in 32 IPAH and 5 HPAH patients by the sequencing methods, and 12 BMPR2 CNVs (12/49, 24.5%) were indicated from the panel data and subsequently validated by MLPA. This differs from Cogan's report [8], in which deletion/duplication analysis detected a higher frequency of BMPR2 mutations in HPAH than sequence analysis did (48% vs. 37%). Using panel data to indicate CNVs could therefore substantially reduce costs by limiting further testing to specific samples. Our results revealed that the newly developed and freely available panelcn.MOPS pipeline [25] showed high sensitivity and specificity in detecting BMPR2 CNVs ranging from single exons to the whole gene. All were confirmed by MLPA, with one slight inconsistency in patient PAH52, for whom panelcn.MOPS suggested a deletion of exon 11 only whereas MLPA revealed a deletion of exons 11-12 in BMPR2. Conversely, the CNVs detected in ENG and ACVRL1 all proved to be false, which might be related to the high GC content of these two genes.
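The proportions quoted in this paragraph and in the Results can be re-derived from the reported counts. The helper below is an illustrative arithmetic check; the function name is hypothetical and all counts are taken from the text.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * numerator / denominator, 1)


print(pct(56, 179))  # 31.3  IPAH patients with a causative mutation
print(pct(8, 12))    # 66.7  HPAH patients with a causative mutation
print(pct(32, 179))  # 17.9  IPAH patients with a BMPR2 mutation found by sequencing
print(pct(5, 12))    # 41.7  HPAH patients with a BMPR2 mutation found by sequencing
print(pct(37, 49))   # 75.5  BMPR2 mutations found by sequencing
print(pct(12, 49))   # 24.5  BMPR2 mutations found as CNVs
```

The last two lines also show that the 37 sequence-level mutations and the 12 CNVs together account for all 49 BMPR2 findings.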
Notably, no suspected mutations were detected in 86 IPAH patients and 4 HPAH families in the 13 known genes included in our panel, indicating that other genes potentially responsible for the disease might exist. In recent years, more genes have been discovered, such as KLF2 [26], SOX17 [27] and AQP1 [27]. We plan to add these novel genes and other candidate genes related to PAH, such as FOXF1 [28], RASA1 [29], TBX4 [30], ABCA3 [31] and SMAD1 [15], to our panel in the future. Furthermore, we will pursue novel gene discovery through whole exome sequencing (WES), especially in the four HPAH families with negative results in our cohort.
PVOD, characterized by extensive and diffuse constriction and/or occlusion of pulmonary veins and venules rather than pulmonary arteriopathy, shares several clinical and hemodynamic similarities with PAH and is therefore often misdiagnosed as IPAH. However, PVOD patients usually have a worse prognosis, and more importantly, the administration of PAH-specific therapy (vasodilators) can precipitate severe acute pulmonary edema [32]. Hence, distinguishing PVOD/PCH from PAH is extremely important, and the identification of biallelic EIF2AK4 mutations makes this possible. In our study, homozygous or compound heterozygous EIF2AK4 mutations were detected in 6 patients clinically diagnosed with IPAH; thus, their diagnosis should be corrected to PVOD.
Our clinical data demonstrated that patients with biallelic EIF2AK4 mutations had a lower DLCO value, suggesting a higher probability of detecting biallelic EIF2AK4 mutations in PAH patients with DLCO values < 40% of the predicted value. However, because of our limited PVOD sample size, the cut-off value of DLCO (% of predicted) and the corresponding detection rate require further study.
It was reported that PVOD patients carrying biallelic EIF2AK4 mutations had worse survival than IPAH patients [33] and PVOD patients without EIF2AK4 mutations [34], and showed no response to PAH therapy [34]. In contrast, neither of our two PVOD patients who received intravenous prostacyclin analogues, PAH3 and PAH151, developed pulmonary edema. Regrettably, none of these 6 patients underwent microscopic examination of lung tissue; therefore, we could not determine what differences might exist between PVOD patients who show a good response to PAH-targeted drugs and those who do not.
With the extensive development of molecular genetics in clinical services, increasing amounts of genetic data have been obtained. Researchers are dedicated to exploring genotype-phenotype correlations between causative genes and PAH phenotypes. A meta-analysis of 1550 PAH patients revealed that PAH patients with BMPR2 mutations presented at a younger age at diagnosis with more severe disease and were younger at death or transplantation than non-carriers [9]. PAH patients with ACVRL1 mutations were younger at diagnosis and had a worse prognosis than other PAH patients [35]. In accord with previous reports, our data showed that PAH patients with BMPR2 mutations were younger at diagnosis and presented more severe pulmonary hemodynamic impairment and a worse cardiac index. Probably because of the limited sample size, we did not observe any differences between the ACVRL1 mutation group and other PAH patients (data not shown). Exploring the genotype-phenotype correlation might help predict prognosis based on genetic results and lay the foundation for further mechanistic studies and targeted therapy.

Conclusion
Our data further expand the spectrum of PAH causative gene mutations and affirm that the panelcn.MOPS pipeline is efficient for BMPR2 CNV detection from our panel sequencing data in clinical diagnosis, indicating which specific samples should undergo further MLPA testing. This approach would effectively save costs in routine clinical diagnosis. The targeted panel assay represents a highly valuable clinical tool and lays the foundation for further study.

Additional file
Additional file 1: Table S1. PAH panel genes. Table S2. Variants of unknown significance (VUS) detected in the panel genes. Table S3. CNVs in ENG and ACVRL1 by panelcn.MOPS and MLPA. Table S4. Genotype-phenotype correlation between biallelic EIF2AK4 mutation carriers and other PAH patients. Figure S1. Molecular genetic testing schedule. Figure S2. 4 HPAH families without an identified causative mutation.

Ethics approval and consent to participate
Each individual accepting the genetic test was adequately informed and signed a consent form. The study was approved by the ethics committee of Fuwai Hospital (Approval NO.: 2017-877) and adhered to the Declaration of Helsinki.

Consent for publication
Each individual accepting the genetic test signed a consent form and agreed to allow their anonymized samples and genetic results to be used for further research studies and publications.