
Characterizing the urinary proteome of prematurity-associated lung disease in school-aged children



Although different phenotypes of lung disease after preterm birth have recently been described, the underlying mechanisms associated with each phenotype are poorly understood. We, therefore, compared the urinary proteome for different spirometry phenotypes in preterm-born children with preterm- and term-born controls.


Preterm and term-born children aged 7–12 years, from the Respiratory Health Outcomes in Neonates (RHiNO) cohort, underwent spirometry and urine collection. Urine was analysed by Nano-LC Mass-Spectrometry with Tandem-Mass Tag labelling. The preterm-born children were classified into phenotypes of prematurity-associated preserved ratio impaired spirometry (pPRISm, FEV1 < lower limit of normal (LLN), FEV1/FVC ≥ LLN), prematurity-associated obstructive lung disease (POLD, FEV1 < LLN, FEV1/FVC < LLN) and preterm controls (FEV1 ≥ LLN). Biological relationships between significantly altered protein abundances were analysed using Ingenuity Pathways Analysis software, and receiver operator characteristic curves were calculated.


Urine was analysed from 160 preterm-born children and 44 term controls. 27 and 21 were classified into the pPRISm and POLD groups, respectively. A total of 785 proteins were detected. Compared to preterm-born controls, sixteen significantly altered proteins in the pPRISm group were linked to six biological processes related to upregulation of inflammation and T-cell biology. In contrast, four significantly altered proteins in the POLD group were linked with neutrophil accumulation. Four proteins (DNASE1, PGLYRP1, B2M, SERPINA3) in combination had an area under the curve of 0.73 for pPRISm and three combined proteins (S100A8, MMP9 and CTSC) had AUC of 0.76 for POLD.


In this exploratory study, we demonstrate differential associations of the urinary proteome with pPRISm and POLD.

Trial registration

EudraCT: 2015-003712-20


Preterm-born children, including those who developed the neonatal lung disease bronchopulmonary dysplasia (BPD, also known as chronic lung disease of prematurity), are at risk of low lung function (prematurity-associated lung disease, PLD) in childhood and beyond. Our recent systematic review reported > 9% difference in percent predicted forced expiratory volume in 1s (FEV1) between all preterm-born and term-born subjects in later life [1], increasing to 16% in those who had BPD in infancy. However, in multivariable regression models, gestational age at birth and intrauterine growth restriction (IUGR) are better predictors of PLD in childhood than BPD [2]. Furthermore, we have recently demonstrated different spirometry PLD phenotypes of prematurity-associated obstructive lung disease (POLD), prematurity-associated preserved ratio-impaired spirometry (pPRISm) and dysanapsis [3]. There is also concern that PLD is associated with early onset of chronic obstructive pulmonary disease (COPD) [4].

Historically, mechanistic studies have focussed on those with BPD in infancy, with evidence of smooth muscle extension into the distal airways in post-mortem samples from infants [5], and peri-bronchial fibrosis and CD8+ T-lymphocyte epithelial infiltrate in adolescent [6] and adult [7] survivors of BPD. A proportion of those with PLD will respond to inhaled therapies [8]; however, a clearer understanding of the biological pathways underlying these PLD-associated phenotypes will aid their identification and the development of targeted therapy. As urine lacks the same homeostatic mechanisms as blood, systemic protein changes accumulate and the urinary proteome may show alterations prior to clinical manifestations or histopathological changes to the lung tissue, reflecting earlier stages of disease development [9]. Urine proteomics has been used to study adult respiratory diseases [10], as well as BPD [11] and respiratory infections [12] in preterm-born infants, with the advantage that it can be sampled easily and non-invasively. The urinary proteome of neonates who develop BPD has shown increases of proteins associated with leukocyte mediated immunity, but with downregulation of myeloid cell lines and neutrophil degranulation [11], whereas those with infectious vs. non-infectious respiratory disease show differences in proteins related to cell adhesion, enzymatic regulation and inflammatory response [12]. However, to our knowledge, the changes in the urinary proteome in preterm-born individuals with lung function impairment in childhood have yet to be studied. We, therefore, performed exploratory analyses of the urinary proteome in preterm-born, school-aged children, with term-born matched controls, to elucidate the biological mechanisms underlying different PLD phenotypes of pPRISm and POLD.



This study was conducted on a cohort of children recruited to the Respiratory Health Outcomes in Neonates study (RHiNO, EudraCT: 2015-003712-20) which has been described extensively previously [2, 3, 8]. Briefly, children from a previous study [13] were supplemented with additional preterm-born children sourced from NHS Wales healthcare records and sent a respiratory and neurodevelopmental questionnaire if they were born ≤ 34 or ≥ 37 weeks’ gestation and were aged 7–12 years. Children with significant congenital malformations, cardiopulmonary or neuromuscular disease were excluded. Ethical approval was obtained from the South-West Bristol Research Ethics Committee (15/SW/0289). Parents gave informed written consent and children provided assent. Recruitment took place prospectively between November 2016 and September 2019.

Responders underwent spirometry (Microloop, Care Fusion, UK), performed by trained research nurses according to ATS/ERS guidelines [14], and results were normalised using Global Lung Initiative (GLI) references [15]. Any respiratory medications were withheld prior to their assessment (short- and long-acting β2-agonists for 8 and 48 hours, respectively; inhaled corticosteroids for 24 h; and leukotriene receptor antagonists for 48 h) and children were free of respiratory infections for at least three weeks prior to testing. Low lung function in preterm-born children (PLD) was defined as FEV1 less than the lower limit of normal (LLN) as per GLI references [15]. Those with PLD were further categorised, as previously described [3], into pPRISm (FEV1 < LLN and FEV1/FVC ≥ LLN), and POLD groups (FEV1 < LLN with an FEV1/FVC < LLN). Preterm-born control (PTc) and term-born children had FEV1 ≥ LLN. BPD was defined as oxygen dependency of 28 days or greater for those born < 32 weeks’ gestation and at 56 days of age for those born ≥ 32 weeks’ gestation [16]. Intrauterine growth restriction (IUGR) was defined as birthweight < 10th percentile adjusted for sex and gestation (LMSgrowth v2.77, Medical Research Council, UK). Neonatal history was corroborated with medical records.
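The phenotype definitions above can be expressed as a simple decision rule. The following is a minimal sketch only, not the study's analysis code: the function name and argument names are hypothetical, and it assumes that each measured value and its corresponding LLN are supplied in the same units (for example, both as absolute values or both as z-scores).

```python
def classify_preterm_phenotype(fev1: float, fev1_lln: float,
                               fev1_fvc: float, fev1_fvc_lln: float) -> str:
    """Assign a preterm-born child to PTc, pPRISm or POLD.

    Hypothetical helper: each value and its LLN must share the same units.
    """
    # FEV1 at or above the lower limit of normal: preterm-born control.
    if fev1 >= fev1_lln:
        return "PTc"
    # Low FEV1 with a preserved FEV1/FVC ratio: pPRISm.
    if fev1_fvc >= fev1_fvc_lln:
        return "pPRISm"
    # Low FEV1 with a low FEV1/FVC ratio: POLD.
    return "POLD"


if __name__ == "__main__":
    # Illustrative values only, not study data.
    print(classify_preterm_phenotype(1.5, 1.8, 0.85, 0.78))
    print(classify_preterm_phenotype(1.5, 1.8, 0.70, 0.78))
```

Checking FEV1 first mirrors the text, in which PLD (FEV1 < LLN) is established before the FEV1/FVC ratio is used to separate pPRISm from POLD.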

Sample collection and analysis

Urine samples were obtained at the time of spirometry, aliquoted and stored at -80 °C on the day of collection until analysis.

TMT labelling

Urine samples were analysed at the University of Bristol Proteomics Facility. 190 μl of urine was digested with trypsin (1.25 μg trypsin; 37 °C, overnight), labelled with Tandem Mass Tag (TMT) eleven plex reagents according to the manufacturer’s protocol (Thermo Fisher Scientific, Loughborough, UK) and the labelled samples pooled. The pooled sample was desalted using a SepPak cartridge according to the manufacturer’s instructions (Waters, Milford, Massachusetts, USA). Eluate from the SepPak cartridge was evaporated to dryness and resuspended in 1% formic acid prior to analysis by nano-LC MSMS using an Orbitrap Fusion Lumos mass spectrometer (Thermo Scientific).

Nano-LC mass spectrometry

The TMT-labelled pool was fractionated using an Ultimate 3000 nano-LC system in line with an Orbitrap Fusion Lumos mass spectrometer (Thermo Scientific). In brief, peptides in 1% (vol/vol) formic acid were injected onto an Acclaim PepMap C18 nano-trap column (Thermo Scientific). After washing with 0.5% (vol/vol) acetonitrile and 0.1% (vol/vol) formic acid, peptides were resolved on a 250 mm × 75 μm Acclaim PepMap C18 reverse phase analytical column (Thermo Scientific) over a 150 min organic gradient, using 7 gradient segments (1–6% solvent B over 1 min, 6–15% B over 58 min, 15–32% B over 58 min, 32–40% B over 5 min, 40–90% B over 1 min, held at 90% B for 6 min and then reduced to 1% B over 1 min) with a flow rate of 300 nl min−1. The TMT-labelled pool underwent a further fractionation to try to maximise peptide yield. The second fractionation used the above methodology again with a different gradient protocol: 6 gradient segments (1–6% solvent B over 1 min, 6–25% B over 118 min, 25–40% B over 3 min, 40–90% B over 1 min, held at 90% B for 6 min and then reduced to 1% B over 1 min), again with a flow rate of 300 nl min−1. Solvent A was 0.1% formic acid and Solvent B was aqueous 80% acetonitrile in 0.1% formic acid for both fractionation processes. Peptides were ionized by nano-electrospray ionization at 2.0 kV using a stainless-steel emitter with an internal diameter of 30 μm (Thermo Scientific) and a capillary temperature of 300 °C.

All spectra were acquired using an Orbitrap Fusion Lumos mass spectrometer controlled by Xcalibur 3.0 software (Thermo Scientific) and operated in data-dependent acquisition mode using an SPS-MS3 workflow. FTMS1 spectra were collected at a resolution of 120,000, with an automatic gain control (AGC) target of 200,000 and a maximum injection time of 50ms. Precursors were filtered with an intensity threshold of 5000, according to charge state (to include charge states 2–7) and with monoisotopic peak determination set to Peptide. Previously interrogated precursors were excluded using a dynamic window (60s ± 10ppm). The MS2 precursors were isolated with a quadrupole isolation window of 0.7 m/z. ITMS2 spectra were collected with an AGC target of 10,000, maximum injection time of 70ms and CID collision energy of 35%.

For FTMS3 analysis, the Orbitrap was operated at 50,000 resolution with an AGC target of 50,000 and a maximum injection time of 105ms. Precursors were fragmented by high energy collision dissociation (HCD) at a normalised collision energy of 60% to ensure maximal TMT reporter ion yield. Synchronous Precursor Selection (SPS) was enabled to include up to 5 MS2 fragment ions in the FTMS3 scan. All mass spectrometry runs were performed consecutively on the mass spectrometer with blank runs in between to prevent carry over from one experiment to the next.

Data analysis

The raw data files were processed and quantified using Proteome Discoverer software v2.1 (Thermo Scientific) and searched against the UniProt Human database (downloaded October 2019: 150,786 entries) using the SEQUEST HT algorithm. Peptide precursor mass tolerance was set at 10ppm, and MS/MS tolerance was set at 0.6Da. Search criteria included oxidation of methionine (+ 15.995Da), acetylation of the protein N-terminus (+ 42.011Da) and Methionine loss plus acetylation of the protein N-terminus (-89.03Da) as variable modifications and carbamidomethylation of cysteine (+ 57.021Da) and the addition of the TMT mass tag (+ 229.163Da) to peptide N-termini and lysine as fixed modifications. Searches were performed with full tryptic digestion and a maximum of two missed cleavages were allowed. The reverse database search option was enabled, and all data was filtered to satisfy false discovery rate (FDR) of 5%. A more moderate stringency was applied to reduce the risk of important biological discoveries being classified as false negatives and allowed further consideration of the biological relevance of the protein through statistical and enrichment analysis.

Statistical analysis

Baseline population characteristics were compared using Chi-squared or t-test as appropriate. Replicate numbers, i.e., number of samples in which a particular protein was detected, were calculated. To account for any dilutional effect on a urine sample altering total protein load, protein content was normalised using a central tendency method to the median protein abundance of its respective MS run, as previously described [11, 17]. Relative protein abundances, determined from the quantity of TMT-tag counts at each detected peptide's spectral peak, between MS runs were scaled using pool samples. Samples with a total protein abundance ± 2-fold difference from the median MS run protein abundance were excluded to ensure accurate comparative quantitation by Proteome Discoverer v2.1.
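The exclusion and normalisation steps described above could be sketched as follows. This is an illustrative reading of the text rather than the pipeline actually used: the data layout (a dictionary of samples within a single MS run), the function name, and the choice to scale each sample so that its median protein abundance matches the run-wide median are all assumptions, and abundances are assumed to be strictly positive.

```python
from statistics import median


def filter_and_normalise(run: dict) -> dict:
    """Exclude outlying samples from one MS run, then median-normalise the rest.

    `run` maps sample id -> {protein name: abundance} (hypothetical layout).
    """
    # Total protein abundance per sample and the median total for the run.
    totals = {s: sum(p.values()) for s, p in run.items()}
    run_median_total = median(totals.values())

    # Keep samples whose total lies within 2-fold of the run median.
    kept = {s: p for s, p in run.items()
            if run_median_total / 2 <= totals[s] <= run_median_total * 2}

    # Central-tendency normalisation (assumed form): rescale each kept
    # sample so its median abundance equals the median of sample medians.
    sample_medians = {s: median(p.values()) for s, p in kept.items()}
    target = median(sample_medians.values())
    return {s: {prot: a * target / sample_medians[s] for prot, a in p.items()}
            for s, p in kept.items()}


if __name__ == "__main__":
    # Illustrative values only, not study data.
    example_run = {
        "a": {"P1": 15.0, "P2": 30.0, "P3": 45.0},
        "b": {"P1": 20.0, "P2": 40.0, "P3": 60.0},
        "c": {"P1": 100.0, "P2": 200.0, "P3": 300.0},  # > 2-fold above median
    }
    print(filter_and_normalise(example_run))
```

In this toy run, sample "c" has a total abundance five times the run median and is dropped, while the two remaining samples are rescaled to a common median.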

Scaled protein abundances were log2-transformed and fold changes (log2FC) between groups were compared using Welch’s t-test. p < 0.05 was considered statistically significant. All analyses were performed using R v4.0.4 (R Foundation for Statistical Computing, Austria). Gene names synonymous with protein names have been used. Functional enrichment analysis (identifying changes in classes of proteins present) was performed with Webgestalt [18]. Ingenuity Pathways Analysis (IPA, Qiagen®, Germany) software identified functional relationships between significantly different protein abundances between groups, highlighting altered biological processes, and calculated activation z-scores, which predict biological process activation/inhibition based on published literature-derived gene-biological function relationships, with z-scores ± 2 considered significant. Receiver Operator Characteristic (ROC) curves were generated for biologically related proteins (identified by IPA), with high replicate numbers, between study groups to assess potential biomarker performance (assessed by area under the curve (AUC)), using two linear models, one based on the whole cohort and one using a leave-one-out cross validation (LOOCV) method. These proteins of interest were also analysed in univariable linear regression models to ascertain associations between these proteins and other early and current life factors. Associations with a p-value < 0.1 were combined into a multivariable model to examine the overall combined influence of each association.
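As an illustration of the group comparison described above, the sketch below log2-transforms two sets of scaled abundances and computes the log2 fold change, Welch's t statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library. It is not the study's R code; the function name and example values are hypothetical, and the p-value is deliberately omitted because it requires the t-distribution (available, for example, through `scipy.stats.ttest_ind` with `equal_var=False`).

```python
import math
from statistics import mean, variance


def welch_log2(group_a, group_b):
    """Return (log2 fold change, Welch t statistic, degrees of freedom).

    Inputs are scaled abundances (strictly positive) for one protein.
    """
    a = [math.log2(x) for x in group_a]
    b = [math.log2(x) for x in group_b]

    # Squared standard errors of each group mean (sample variance / n).
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)

    # Difference of mean log2 abundances is the log2 fold change.
    log2fc = mean(a) - mean(b)
    t_stat = log2fc / math.sqrt(se2_a + se2_b)

    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return log2fc, t_stat, df


if __name__ == "__main__":
    # Illustrative values only, not study data.
    print(welch_log2([2.0, 4.0, 8.0], [1.0, 2.0, 4.0]))
```

Unlike Student's t-test, Welch's test does not assume equal variances in the two groups, which suits comparisons between phenotype groups of unequal size.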



From 768 children (including 565 preterm-born and 203 term-born) recruited to RHiNO, urine samples were analysed from 270 participants. 64 (23.7%) samples were excluded as outliers (sample total protein abundance ± 2-fold difference from median protein abundance of respective MS run, as described above) (Additional file Fig. 1). Demographics of the included 206 participants are shown in Table 1. The demographics were largely similar between the included and excluded participants, except that rates of asthma were significantly higher and rates of BPD were lower in the included POLD group (Additional file Table 1). Preterm-born children were marginally older than the term-born children (mean ± SD 10.4 ± 1.4 years vs. 9.9 ± 1.1, p = 0.02) and had higher rates of asthma (39 (24.2%) vs. 2 (4.3%), p < 0.001). 47 (29.2%) of the preterm-born subjects had received a neonatal diagnosis of mild/moderate/severe BPD, and 48 (30%) had an FEV1 < LLN. Of those, 27 (56%) were classified as pPRISm and 21 (44%) as POLD.

Table 1 Sample Demographics

Detected urinary proteins

A total of 785 proteins were detected, 735 (93.6%) of which were mapped to published gene names. 129 proteins were common to all samples. Functional enrichment analysis [18] was possible for 681 (86.8%) of the detected proteins (Additional file Fig. 2). 288 proteins were significantly different between any of the phenotype comparisons, and functional enrichment analysis was possible for 255 (88.5%). Overall, an enrichment of proteins related to metabolic processes, hydrolase activity and extracellular space/cell membrane activities was observed in the preterm-born groups.

Comparison between the pPRISm group with preterm- and term-control groups

Table 2 Significantly altered biological processes in pPRISm & POLD vs. PTc identified by IPA software

37 (5.3%) proteins had significantly different abundance when compared to PTc (Fig. 1; Additional file Table 2), and 62 (8.9%) when compared to the term-born group (Fig. 1; Additional file Table 3). 14 proteins were common between the two comparisons. IPA linked 16 significantly altered proteins in pPRISm compared to PTc to six biological processes (Table 2; Fig. 2), including Inflammation of body cavity (PGLYRP1, DNASE1, MYH9, SERPINA3, CTSV, AGT, ANXA1, CLEC4G, SCGB1A1, B2M, CD14) (p = 0.042); Apoptosis of myeloid cells (SERPINA3, ANXA1, ANPEP, CD14) (p = 0.038); Quantity of leucocytes (GLA, CLEC11A, PGLYRP1, DNASE1, CTSV, AGT, ANXA1, CLEC4G, SCGB1A1, B2M, ANPEP) (p = 0.038); and Quantity of T-lymphocytes (PGLYRP1, DNASE1, CTSV, AGT, ANXA1, CLEC4G, B2M, ANPEP) (p = 0.015). IPA-calculated activation z-scores suggested upregulation of these processes (Fig. 2). There was also a significant link between these proteins and the quantities of CD4+ (p = 0.008) and CD8+ (p = 0.005) T-lymphocytes, with a suggestion of a downregulation of CD4+ T-lymphocytes (activation z-score −0.73). IPA analysis of significantly different protein abundances in the pPRISm group compared to the term-born group linked six proteins (AGT, CD14, CSF1, FABP5, HBB, ANXA1) with Synthesis of prostaglandin (p = 0.038, activation z-score 1.23). Five proteins (PRG2, MGAM, CD14, LGALS3BP, ANXA1) were significantly linked with neutrophil activation (p = 0.038, z-score −0.64).

Fig. 1

Volcano Plots showing significant protein differences for pPRISm and POLD phenotypes compared to PTc and Term groups. Vertical line represents Log2 fold change of 0. Horizontal line equivalent to a p-value of 0.05. Proteins with a significant difference between groups highlighted and labelled with respective gene name. Size of circle relative to replicate number. POLD: Prematurity-related obstructive lung disease. pPRISm: Prematurity-related preserved ratio with impaired spirometry. PTc: Preterm-born controls

Fig. 2

Proteins linked with significantly altered biological processes by IPA software within lung function phenotypes (pPRISm and POLD compared to PTc). POLD: Prematurity-related obstructive lung disease. pPRISm: Prematurity-related preserved ratio with impaired spirometry. PTc: Preterm-born controls

ROC analysis (Table 3; Fig. 3) demonstrated that DNASE1, PGLYRP1, B2M and SERPINA3 in combination had the highest predictive ability for identifying pPRISm from within the preterm group (AUC: 0.73 (95% confidence interval 0.61, 0.84), sensitivity 0.80 (0.64, 0.96), specificity 0.73 (0.64, 0.82), p < 0.001). Using the LOOCV model, the predictive ability of this protein panel was AUC 0.65 (0.52, 0.78), p = 0.01 (Additional file Table 6; Additional file Fig. 3). Results from univariable and multivariable linear regression modelling for these proteins are shown in Table 4. DNASE1, PGLYRP1 and B2M remained significantly associated with pPRISm in multivariable modelling (p-values 0.008, 0.011 and 0.018, respectively), with B2M also being significantly associated with a history of BPD in the multivariable model (p = 0.003). No other life factors were significantly associated with SERPINA3 on univariable models, with pPRISm being highly significant (p = 0.005).
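For readers less familiar with the reported metrics, the sketch below shows how an AUC and a Youden point can be computed from a set of scores. It is a generic illustration and not the model fitted in this study: the function names are hypothetical, and the scores are assumed to be outputs of some combined model in which a higher score indicates the phenotype of interest.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the probability that a randomly chosen case scores higher
    than a randomly chosen control (ties count as one half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def youden_point(scores_pos, scores_neg):
    """Return (J, threshold, sensitivity, specificity) maximising
    J = sensitivity + specificity - 1, classifying score >= threshold as a case."""
    best = None
    for thr in sorted(set(scores_pos) | set(scores_neg)):
        sens = sum(s >= thr for s in scores_pos) / len(scores_pos)
        spec = sum(s < thr for s in scores_neg) / len(scores_neg)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, thr, sens, spec)
    return best


if __name__ == "__main__":
    # Illustrative scores only, not study data.
    cases = [0.9, 0.8, 0.4]
    controls = [0.7, 0.3, 0.2, 0.1]
    print(roc_auc(cases, controls))
    print(youden_point(cases, controls))
```

A leave-one-out estimate would refit the scoring model once per participant, each time scoring only the held-out participant, before applying the same AUC calculation to the held-out scores.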

Table 3 ROC Analysis of high replicate proteins implicated in related biological functions by IPA software. POLD: Prematurity-related obstructive lung disease. pPRISm: Prematurity-related preserved ratio with impaired spirometry. PTc: Preterm-born controls. CI: Confidence Interval. PPV: Positive predictive value. NPV: Negative predictive value
Fig. 3

Significantly altered protein abundances in pPRISm vs. PTc comparisons, showing violin plots for (A) DNASE1, (B) PGLYRP1, (C) B2M, and (D) SERPINA3, including comparisons with pPRISm and Term groups. (E) ROC Curve analysis for DNASE1, PGLYRP1, B2M and SERPINA3 in combination for pPRISm vs. PTc. Youden point given. For violin plots, black dot represents mean, bars standard error of the mean. p-values given for between group comparisons. POLD: Prematurity-related obstructive lung disease. pPRISm: Prematurity-related preserved ratio with impaired spirometry. PTc: Preterm-born controls. AUC: Area under the curve

Table 4 Univariable and multivariable linear regression analysis of early and current life factors and proteins of interest in pPRISm compared to PTc

Comparison between the POLD group with preterm- and term-control groups

The POLD group had several significant differences when compared with the PTc group (Table 1) including increased wheeze-ever (85.7% vs. 57.1%, p = 0.027), asthma (52.4% vs. 18.8%, p = 0.001) and IUGR (38.1% vs. 13.4%, p = 0.006). When compared to the pPRISm group (Table 1), POLD had higher wheeze-ever (85.7% vs. 55.6%, p = 0.025) and higher rates of IUGR (38.1% vs. 7.4%, p = 0.009).

44 (6.4%) proteins had a significantly different abundance when compared to PTc (Fig. 1; Additional file Table 4), and 70 (10.1%) when compared to term-born subjects (Fig. 1, Additional file Table 5) with 18 proteins being common within the two comparisons. IPA analyses linked four significantly altered proteins (AGT, CTSC, MMP9, S100A8) to Accumulation of neutrophils when the POLD and PTc groups were compared (p = 0.028, z-score 1.34, Table 2; Fig. 2). IPA linked eight significantly altered proteins with Cellular infiltration by macrophages (AGT, PLAU, C3, MMP9, CSF1, PROCR, IL6ST, PRCP) when the POLD and Term-born groups were compared (p = 0.011, activation z-score 0.59).

ROC analysis (Table 3; Fig. 4) demonstrated that S100A8, MMP9 and CTSC in combination had the highest predictive ability for identifying POLD from within the preterm group (AUC 0.76 (0.63–0.90), sensitivity 0.84 (0.68, 1.00), specificity 0.61 (0.50, 0.72), p < 0.001). Using the LOOCV model, S100A8, MMP9 and CTSC in combination performed similarly (AUC 0.72 (0.57–0.86), p = 0.002) (Additional file Table 6; Additional file Fig. 3). Results from univariable and multivariable linear regression modelling for these proteins are given in Table 5. No other early or current life factors were significantly associated with S100A8 and CTSC abundance in univariable models. A history of BPD was significantly associated with MMP9 abundance in univariable modelling (p = 0.017) and remained significant in the multivariable model (p = 0.017), along with POLD (p = 0.024).

Fig. 4

Significantly altered protein abundances in POLD vs. PTc comparisons, showing violin plots for (A) S100A8, (B) MMP9 and (C) CTSC, including comparisons with pPRISm and Term groups. (D) ROC Curve analysis for S100A8, MMP9 and CTSC in combination for POLD vs. PTc. Youden point given. For violin plots, black dot represents mean, bars standard error of the mean. p-values given for between group comparisons. POLD: Prematurity-related obstructive lung disease. pPRISm: Prematurity-related preserved ratio with impaired spirometry. PTc: Preterm-born controls. AUC: Area under the curve

Table 5 Univariable and multivariable linear regression analysis of early and current life factors and proteins of interest in POLD compared to PTc


In this novel exploratory study, we have characterised the urinary proteome of two phenotypes of PLD, namely pPRISm and POLD, in a large cohort of preterm-born children. We have demonstrated increased abundance of proteins related to inflammatory processes and immune-system function in preterm-born children with low lung function, several years after the initial pulmonary insult occurred in the neonatal period. In those with a pPRISm phenotype, there was evidence of multiple affected biological processes, with ongoing inflammatory process occurring with suggested alteration in T-lymphocyte biology. In contrast, in the POLD group altered biological processes focusing on myeloid cell lines including neutrophil and macrophage activity appear to be affected.

It is apparent that there is greater complexity to PLD, with a need to understand the biological mechanisms or endotypes underlying the different phenotypes of lung disease to understand their pathogenesis. Such identification of endotypes will aid the development of specific targeted therapeutic interventions. For this reason, we have focussed our analysis using current lung function to define the different phenotypes rather than using historical diagnoses of BPD, or other associated life factors, that could influence current lung function. PRISm has recently been described in the adult population to be strongly associated with increased development of COPD, cardiovascular disease and all-cause mortality [19, 20]. We recently reported this specific phenotype in preterm-born children with a different association to bronchodilator response, fraction exhaled nitric oxide (FENO) and early/current life factors compared to POLD and PTc groups [3].

Our urinary proteomic analyses using these phenotypes revealed some interesting hypothesis-generating observations. There were multiple associations with systemic alterations in inflammatory and immune processes postulated in the pPRISm group, with a likely increase in inflammation, overall quantities of leucocytes and, in particular, T-lymphocytes. This observation has recently been corroborated by Um-Bergstrom et al. who reported relative decrease of CD4+ T-cells and increase of CD8+ T-cells in bronchoalveolar lavage fluid from young adults with former BPD, a similar finding to those with COPD. CD8+ T-cells were also negatively correlated with both FEV1 and FEV1/FVC [7]. Adolescent survivors of severe BPD have also been noted to have an increase in bronchial wall CD8+ lymphocytes [6]. A recent urine proteomic study has linked early increases in proteins associated with leukocyte mediated immunity to the later development of BPD in infants born < 29 weeks’ gestation [11]. Our data also suggested that CD4+ lymphocytes may be downregulated in the pPRISm group, of which only 25.9% had a previous history of BPD. A relative increase in CD8+ T-cells number and function has also been associated with severity of COPD [21], and in adult subjects, PRISm is a known risk factor for development of COPD [22]. Four proteins (DNASE1, PGLYRP1, B2M and SERPINA3) showed good predictive ability for identifying pPRISm from PTc in ROC analysis. Deoxyribonuclease-1 (DNASE1) is a ubiquitous endonuclease which degrades the majority of circulating free DNA released from apoptosis and necrotic cell death, with DNASE1 deficiency being previously reported to be associated with autoimmune disease in animal models and humans [23]. Peptidoglycan recognition protein 1 (PGLYRP1), an innate proinflammatory and antibacterial protein, has been linked with asthma in animal models, with PGLYRP1-deficient mice exhibiting a decreased Th2/CD4+ response, with a less severe phenotype [24]. Increased serum beta-2-microglobulin (B2M), the light chain of the class I major histocompatibility complex, has been linked with COPD disease progression, namely development of pulmonary fibrosis, alveolar wall thickening and decreased gas exchange capacity [25]. The anti-protease alpha-1-antichymotrypsin (SERPINA3) manipulates the immune and inflammatory response through inhibition of chymotrypsin and cathepsin G. Previous studies have identified increased SERPINA3 in serum from COPD subtypes associated with metabolic syndrome [26], with genetic mutations resulting in SERPINA3 deficiency resulting in milder disease in patients with COPD [27] and cystic fibrosis (CF) [28]. The reduced abundances of these four proteins that we observed in pPRISm were all significantly linked with a possible upregulation of inflammatory processes, with DNASE1, PGLYRP1 and B2M also being significantly linked with T-cell biology.

IPA analyses of increased abundance of the four proteins (MMP9, AGT, S100A8, CTSC) in the POLD group (when compared to PTc) suggested increased neutrophil accumulation, which is a reasonable hypothesis given the association of neutrophilic inflammation with wheezing in asthma [29]. Whether this is a specific phenotype of PLD or has similarities with neutrophilic asthma will need further investigation. Matrix metalloproteinase-9 (MMP9) is a gelatinase protease, stored in neutrophils, involved in activating proinflammatory cytokines, enhancing inflammatory cell migration, and degradation of the extracellular matrix. Increased MMP9 in respiratory samples has been linked with several lung diseases, including paediatric patients with acute respiratory distress syndrome (ARDS) [30]. Increased MMP9 has also been observed in preterm-born neonates who subsequently develop BPD [31, 32], including a recent urine proteomic study where early increase in MMP9 had a high predictive ability for development of BPD in extremely preterm infants [11]. MMP9 had a significant association with BPD in our cohort in univariable modelling, which remained in our multivariable regression model, along with a significant association with POLD. We have previously shown a significant association between BPD and the development of a POLD phenotype [3]. In older subjects, elevated serum MMP9 has been linked with COPD exacerbations [33] and FEV1 decline in CF [34]. Cathepsin C (CTSC) is a serine protease released by neutrophils that can result in increased tissue-degradation, being implicated in the pathophysiology of pneumonia and ARDS in mechanically ventilated adults [35]. S100A8 is also associated with acute lung injury, being secreted by degranulating neutrophils and bronchial epithelium during infection/inflammation [36]. It has been shown to be increased in lung diseases resulting in tissue remodelling, including in bronchiolitis obliterans in children [37], and in adults with CF and COPD [38]. These three proteins all have a role in tissue remodelling; we have recently reported that the POLD group has significantly altered ventilation mechanics on hyperpolarised 129Xe ventilation and diffusion MRI [39], which is likely to be related to tissue remodelling. MMP9, S100A8 and CTSC in combination had good predictive ability for identifying the POLD group using ROC analysis. Whether these combinations of proteins have prospective predictive value for PLD phenotypes prior to the development of lung function deficits will require further work.

In this study, we have analysed the urinary proteome. Whilst this is not a lung-specific sample type, it is easily and non-invasively obtainable, and has previously been utilised in the study of respiratory diseases in neonates [11] and adults [10]. In addition, as urine lacks the same homeostatic controls as blood, proteome changes in urine may be detectable at an earlier stage of disease [9] which makes it an attractive sample type to study in preterm-born children, as they may be at a milder or pre-symptomatic stage of respiratory impairment, as their lung function continues to develop through adolescence into adulthood [40].

This study represents one of the largest proteomic analyses of urine in the paediatric population, and although lung dysfunction was present in approximately 30% of the preterm-born group, this is the first study to our knowledge that has examined the urinary proteome of this cohort. Our regression modelling has demonstrated that many of the protein changes we have seen are primarily related to current lung function phenotype. We have used a robust TMT-methodology to quantify protein abundances and allow accurate comparisons between phenotypes; however, there may have been proteins with low abundances/low TMT-tag counts that did not reach the limit of detection of the mass spectrometer. To ensure accurate quantitation with Proteome Discoverer software, we excluded a number of samples; however, we saw minimal significant differences in the participant characteristics between included and excluded samples. Whilst our TMT-based methodology gives robust protein abundances for comparative purposes, it does not give absolute protein concentrations within a sample, which would need to be determined to directly apply this data clinically. Whilst our study lacks a validation cohort, we are limited by the number of available large cohorts of preterm-born children who experienced a contemporary standard of neonatal care from which to sample.

In conclusion, we have demonstrated distinct changes in the urinary proteome associated with the two recently described phenotypes of PLD: POLD and pPRISm. The altered proteins suggested involvement of the inflammatory and immune systems in the pPRISm group and potential neutrophilic inflammation in the POLD group. We have also demonstrated the potential of combinations of proteins to identify the POLD and pPRISm phenotypes. Further work specifically targeting these proteins is now required to confirm whether they can be used clinically to screen prospectively for preterm-born children at risk of future lung dysfunction, or whether they can be targeted therapeutically.
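The predictive ability of a protein combination, summarised as an area under the receiver operator characteristic curve (AUC), can be illustrated with a minimal sketch. It assumes, purely for illustration, that a panel is combined by averaging the z-scored relative abundances of its proteins; the study's own models may have combined proteins differently. The function names, protein labels and numbers below are hypothetical and are not study data.

```python
from statistics import mean, pstdev


def auc_from_scores(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank formulation.

    Equals the probability that a randomly chosen case scores higher than a
    randomly chosen control, counting ties as one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def zscores(values):
    """Standardise one protein's abundances to mean 0 and unit variance."""
    mu, sd = mean(values), pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def panel_score(abundances_by_protein):
    """Average the per-protein z-scores into one score per participant.

    Assumption: an unweighted mean of z-scores stands in for whatever
    combination rule a real analysis would fit.
    """
    standardised = [zscores(v) for v in abundances_by_protein.values()]
    return [mean(col) for col in zip(*standardised)]


# Made-up relative abundances for six participants (not study data).
toy_panel = {
    "protein_A": [1.0, 1.2, 0.9, 2.0, 2.2, 1.1],
    "protein_B": [0.5, 0.4, 0.6, 0.9, 1.0, 0.8],
}
toy_labels = [0, 0, 0, 1, 1, 1]  # 1 = phenotype of interest, 0 = control

toy_auc = auc_from_scores(panel_score(toy_panel), toy_labels)
```

Because the score uses only relative abundances, a sketch of this kind fits comparative TMT data, but any clinical threshold would still require absolute concentrations and an independent validation cohort.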

Data Availability

The data generated and analysed that support the findings of this study are included in this published article and its supplementary information file (Additional File 1.pdf). The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD042759. Further data from the RHiNO study are available to research collaborators subject to confidentiality and non-disclosure agreements. Contact Professor Sailesh Kotecha for any data requests.


References

  1. Kotecha SJ, Gibbons JTD, Course CW, Evans EE, Simpson SJ, Watkins WJ et al. Geographical differences and temporal improvements in forced expiratory volume in 1 second of preterm-born children: a systematic review and meta-analysis. JAMA Pediatr. 2022.

  2. Hart K, Cousins M, Watkins WJ, Kotecha SJ, Henderson AJ, Kotecha S. Association of early-life factors with prematurity-associated lung disease: prospective cohort study. Eur Respir J. 2022;59(5).

  3. Cousins M, Hart K, Kotecha SJ, Henderson AJ, Watkins WJ, Bush A et al. Characterising airway obstructive, dysanaptic and PRISm phenotypes of prematurity-associated lung disease. Thorax. 2023.

  4. Bolton CE, Bush A, Hurst JR, Kotecha S, McGarvey L. Lung consequences in adults born prematurely. Thorax. 2015;70(6):574–80.

  5. Bush A, Busst CM, Knight WB, Hislop AA, Haworth SG, Shinebourne EA. Changes in pulmonary circulation in severe bronchopulmonary dysplasia. Arch Dis Child. 1990;65(7):739–45.

  6. Galderisi A, Calabrese F, Fortarezza F, Abman S, Baraldi E. Airway Histopathology of Adolescent Survivors of Bronchopulmonary Dysplasia. J Pediatr. 2019;211:215–8.

  7. Um-Bergström P, Pourbazargan M, Brundin B, Ström M, Ezerskyte M, Gao J et al. Increased cytotoxic T-cells in the airways of adults with former bronchopulmonary dysplasia. Eur Respir J. 2022;60(3).

  8. Goulden N, Cousins M, Hart K, Jenkins A, Willetts G, Yendle L et al. Inhaled corticosteroids alone and in combination with long-acting beta2 receptor agonists to treat reduced lung function in preterm-born children: a randomized clinical trial. JAMA Pediatr. 2021.

  9. Wu J, Gao Y. Physiological conditions can be reflected in human urine proteome and metabolome. Expert Rev Proteomics. 2015;12(6):623–36.

  10. Martelo-Vidal L, Vázquez-Mera S, Miguéns-Suárez P, Salgado-Castro FJ, Blanco-Aparicio M, Mosteiro-Añón M, et al. Application of urinary proteomics for biomarker discovery in respiratory diseases. Eur Respir J. 2022;60(suppl 66):1672.

  11. Ahmed S, Odumade OA, van Zalm P, Smolen KK, Fujimura K, Muntel J, et al. Urine proteomics for noninvasive monitoring of biomarkers in Bronchopulmonary Dysplasia. Neonatology. 2022;119(2):193–203.

  12. Starodubtseva NL, Kononikhin AS, Bugrova AE, Chagovets V, Indeykina M, Krokhina KN, et al. Investigation of urine proteome of preterm newborns with respiratory pathologies. J Proteom. 2016;149:31–7.

  13. Edwards MO, Kotecha SJ, Lowe J, Richards L, Watkins WJ, Kotecha S. Management of Prematurity-Associated Wheeze and its Association with Atopy. PLoS ONE. 2016;11(5):e0155695.

  14. Miller MR, Hankinson J, Brusasco V, Burgos F, Casaburi R, Coates A, et al. Standardisation of spirometry. Eur Respir J. 2005;26(2):319–38.

  15. Quanjer PH, Stanojevic S, Cole TJ, Baur X, Hall GL, Culver BH, et al. Multi-ethnic reference values for spirometry for the 3-95-yr age range: the global lung function 2012 equations. Eur Respir J. 2012;40(6):1324–43.

  16. Ehrenkranz RA, Walsh MC, Vohr BR, Jobe AH, Wright LL, Fanaroff AA, et al. Validation of the National Institutes of Health consensus definition of bronchopulmonary dysplasia. Pediatrics. 2005;116(6):1353–60.

  17. Callister SJ, Barry RC, Adkins JN, Johnson ET, Qian WJ, Webb-Robertson BJ, et al. Normalization approaches for removing systematic biases associated with mass spectrometry and label-free proteomics. J Proteome Res. 2006;5(2):277–86.

  18. Liao Y, Wang J, Jaehnig EJ, Shi Z, Zhang B. WebGestalt 2019: gene set analysis toolkit with revamped UIs and APIs. Nucleic Acids Res. 2019;47(W1):W199–W205.

  19. Higbee DH, Granell R, Davey Smith G, Dodd JW. Prevalence, risk factors, and clinical implications of preserved ratio impaired spirometry: a UK Biobank cohort analysis. Lancet Respir Med. 2022;10(2):149–57.

  20. Wan ES, Balte P, Schwartz JE, Bhatt SP, Cassano PA, Couper D, et al. Association between preserved ratio impaired spirometry and clinical outcomes in US adults. JAMA. 2021;326(22):2287–98.

  21. Williams M, Todd I, Fairclough LC. The role of CD8+ T lymphocytes in chronic obstructive pulmonary disease: a systematic review. Inflamm Res. 2021;70(1):11–8.

  22. Marott JL, Ingebrigtsen TS, Colak Y, Vestbo J, Lange P. Trajectory of preserved ratio impaired spirometry: natural history and long-term prognosis. Am J Respir Crit Care Med. 2021;204(8):910–20.

  23. Keyel PA. Dnases in health and disease. Dev Biol. 2017;429(1):1–11.

  24. Yao X, Gao M, Dai C, Meyer KS, Chen J, Keeran KJ, et al. Peptidoglycan recognition protein 1 promotes house dust mite-induced airway inflammation in mice. Am J Respir Cell Mol Biol. 2013;49(6):902–11.

  25. Wu Z, Yan M, Zhang M, Wu N, Ma G, Wang B, et al. beta2-microglobulin as a biomarker of pulmonary fibrosis development in COPD patients. Aging. 2020;13(1):1251–63.

  26. Zhang Z, Wang J, Li Y, Liu F, Chen L, He S, et al. Proteomics and metabolomics profiling reveal panels of circulating diagnostic biomarkers and molecular subtypes in stable COPD. Respir Res. 2023;24(1):73.

  27. Sandford AJ, Chagani T, Weir TD, Pare PD. Alpha 1-antichymotrypsin mutations in patients with chronic obstructive pulmonary disease. Dis Markers. 1998;13(4):257–60.

  28. Mahadeva R, Sharples L, Ross-Russell RI, Webb AK, Bilton D, Lomas DA. Association of alpha(1)-antichymotrypsin deficiency with milder lung disease in patients with cystic fibrosis. Thorax. 2001;56(1):53–8.

  29. Ray A, Kolls JK. Neutrophilic inflammation in Asthma and Association with Disease Severity. Trends Immunol. 2017;38(12):942–54.

  30. Kong MY, Gaggar A, Li Y, Winkler M, Blalock JE, Clancy JP. Matrix metalloproteinase activity in pediatric acute lung injury. Int J Med Sci. 2009;6(1):9–17.

  31. Sweet DG, Curley AE, Chesshyre E, Pizzotti J, Wilbourn MS, Halliday HL, et al. The role of matrix metalloproteinases -9 and -2 in development of neonatal chronic lung disease. Acta Paediatr. 2004;93(6):791–6.

  32. Davies PL, Spiller OB, Beeton ML, Maxwell NC, Remold-O’Donnell E, Kotecha S. Relationship of proteinases and proteinase inhibitors with microbial presence in chronic lung disease of prematurity. Thorax. 2010;65(3):246–51.

  33. Wells JM, Parker MM, Oster RA, Bowler RP, Dransfield MT, Bhatt SP et al. Elevated circulating MMP-9 is linked to increased COPD exacerbation risk in SPIROMICS and COPDGene. JCI Insight. 2018;3(22).

  34. Devereux G, Steele S, Jagelman T, Fielding S, Muirhead R, Brady J, et al. An observational study of matrix metalloproteinase (MMP)-9 in cystic fibrosis. J Cyst Fibros. 2014;13(5):557–63.

  35. Seren S, Derian L, Keles I, Guillon A, Lesner A, Gonzalez L et al. Proteinase release from activated neutrophils in mechanically ventilated patients with non-COVID-19 and COVID-19 pneumonia. Eur Respir J. 2021;57(4).

  36. Kotsiou OS, Papagiannis D, Papadopoulou R, Gourgoulianis KI. Calprotectin in Lung Diseases. Int J Mol Sci. 2021;22(4).

  37. Jerkic SP, Michel F, Donath H, Herrmann E, Schubert R, Rosewich M, et al. Calprotectin as a new sensitive marker of neutrophilic inflammation in patients with Bronchiolitis Obliterans. Mediators Inflamm. 2020;2020:4641585.

  38. Lorenz E, Muhlebach MS, Tessier PA, Alexis NE, Duncan Hite R, Seeds MC, et al. Different expression ratio of S100A8/A9 and S100A12 in acute and chronic lung diseases. Respir Med. 2008;102(4):567–73.

  39. Chan HF, Smith LJ, Biancardi AM, Bray J, Marshall H, Hughes PJC, et al. Image phenotyping of preterm-born children using hyperpolarized 129Xe lung magnetic resonance imaging and multiple-breath washout. Am J Respir Crit Care Med. 2023;207(1):89–100.

  40. Belgrave DCM, Granell R, Turner SW, Curtin JA, Buchan IE, Le Souef PN, et al. Lung function trajectories from pre-school age to adulthood and their associations with early life factors: a retrospective analysis of three population-based birth cohort studies. Lancet Respir Med. 2018;6(7):526–34.


Acknowledgements

We are extremely grateful to all the children and families who took part in the RHiNO study for all their support and enthusiasm, and to the nursing staff who were responsible for assessing the children and collecting the data. We also thank the neonatal consultants and administration teams in the South Wales neonatal units for assisting in identifying the children and for acquiring the medical notes.


Funding

The study was funded by the Medical Research Council (MR/M022552/1).

Author information

Contributions

SK conceived and designed the study. MC, KH and SK were involved in identifying and assessing the children and in sample collection. KJH was involved with sample analysis. CWC, PAL, SJK, WJW, KJH, and SK were involved in the data analysis and interpretation. CWC and SK drafted the manuscript. All authors were involved in revising the manuscript and approved the final submitted version.

Corresponding author

Correspondence to Sailesh Kotecha.

Ethics declarations

Ethics approval and consent to participate

Ethical approval was obtained from the South-West Bristol Research Ethics Committee (15/SW/0289). Parents gave informed written consent and children provided assent.

Consent for publication

Not applicable.

Competing interests

SK reports grants from Medical Research Council and from GSK for conduct of the current study; and grants from NIHR/HTA, Aspire Pharma and Moulton Foundation outside the submitted work. SJK and WJW report grants from Moulton Foundation outside of this work.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from Creative Commons. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Cite this article

Course, C.W., Lewis, P.A., Kotecha, S.J. et al. Characterizing the urinary proteome of prematurity-associated lung disease in school-aged children. Respir Res 24, 191 (2023).
