Pharmacogenetics, pharmacogenomics and airway disease

The availability of a draft sequence for the human genome will revolutionise research into airway disease. This review deals with two of the most important areas impinging on the treatment of patients: pharmacogenetics and pharmacogenomics. Considerable inter-individual variation exists at the DNA level in targets for medication, and variability in response to treatment may, in part, be determined by this genetic variation. Increased knowledge about the human genome might also permit the identification of novel therapeutic targets by expression profiling at the RNA (genomics) or protein (proteomics) level. This review describes recent advances in pharmacogenetics and pharmacogenomics with regard to airway disease.


Introduction
The recent publication of two draft sequences for the human genome, together with rapidly increasing knowledge of the extent of genetic variability between individuals available from resources such as the SNP Consortium (in which SNP stands for single-nucleotide polymorphism), has major implications for the study of respiratory disease. Genetic variability between individuals in drug-metabolising enzymes or in the primary targets for drugs might account in part for inter-individual variability in treatment response. Research in this area is covered by the broad term pharmacogenetics. In addition, knowledge of the primary sequence of the approximately 30,000 genes in the human genome will permit the identification of novel genes that might be important in disease aetiology or progression and might be potential targets for therapeutic agents. Expression-profiling approaches to the identification of targets for new treatments are covered by the broad term pharmacogenomics. This review covers some of the fundamental issues important in these two developing branches of research.

Pharmacogenetics
Polymorphic variation in the human genome
Genetic variability at the DNA level occurs in approximately 1 in 500 to 1 in 1000 bases of coding DNA and in 1 in 300 to 1 in 500 bases in non-coding DNA [1]. These rates are averages across the human genome but it is clear that, when specific short regions of DNA are considered, the rates of polymorphism can be much higher or lower. The vast majority of variation is due to substitutions of one base at a specific site (i.e. an SNP). However, other variations are possible, including deletions, insertions and the expansion of tandem repeat sequences. One important consequence of the insertion or deletion of even a single base pair within coding regions is the subsequent frameshift introduced downstream. Because the amino acid sequence of a protein is determined at the DNA level by groups of three base pairs coding for each amino acid, introducing a single additional base changes the 'reading frame' downstream of this site, thus resulting in an alteration in the amino acid sequence in the protein. This frameshift will also disrupt downstream stop codons such that the protein might be truncated or extended, depending on where new stop codons occur.
Review: Pharmacogenetics, pharmacogenomics and airway disease. Respiratory Research Vol 3 No 1, Hall (page number not for citation purposes)
The functionality of any given polymorphism depends on its nature and position. Thus SNPs in non-coding regions are likely to be non-functional in the main, although if they either interfere with recognised consensus sequences for the binding of transcription factors or alter enhancer elements or splice signals they can have effects on the level of expression of downstream genes. Within coding regions, SNPs are more likely to have functional effects if they occur in the first or second base pair of a codon; redundancy in the amino acid coding system means that the third base pair can in some cases be altered without changing the amino acid sequence of the protein. Thus, polymorphism at the DNA level can be either synonymous or non-synonymous, the latter implying that the polymorphism produces an amino acid substitution in the relevant protein.
Amino acid substitutions themselves can be considered to be conservative or non-conservative, depending on whether they alter the charge or the size of the substituted group. Again, one can predict that non-conservative amino acid substitutions would be more likely to have a direct functional effect than conservative substitutions because the three-dimensional structure of the protein or the charge distribution around important functional epitopes is more likely to be affected. As mentioned above, insertions and/or deletions are more likely than SNPs to produce functional effects within coding regions because they will disrupt the amino acid sequence of the protein.
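The distinctions drawn above (synonymous versus non-synonymous substitutions, and the frameshift produced by a single-base insertion) can be illustrated with a short Python sketch. The 15-base sequence and the helper names `translate` and `classify_variant` are invented purely for illustration; only the standard genetic code is taken as given.

```python
from itertools import product

# Standard genetic code, with codons enumerated in the base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon reached
            break
        protein.append(aa)
    return "".join(protein)


def classify_variant(reference, variant):
    """Label a variant relative to the reference (illustrative helper only)."""
    if (len(variant) - len(reference)) % 3 != 0:
        return "frameshift"  # insertion or deletion that shifts the reading frame
    if translate(variant) == translate(reference):
        return "synonymous"
    return "non-synonymous"


# Hypothetical reference: ATG GCT GAA CGT TAA (Met-Ala-Glu-Arg-stop)
reference = "ATGGCTGAACGTTAA"
third_base_snp = "ATGGCCGAACGTTAA"              # GCT -> GCC, still Ala
first_base_snp = "ATGGCTAAACGTTAA"              # GAA -> AAA, Glu -> Lys
insertion = reference[:4] + "C" + reference[4:]  # one extra base

print(translate(reference), classify_variant(reference, reference))
# MAER synonymous
print(translate(third_base_snp), classify_variant(reference, third_base_snp))
# MAER synonymous
print(translate(first_base_snp), classify_variant(reference, first_base_snp))
# MAKR non-synonymous
print(translate(insertion), classify_variant(reference, insertion))
# MA frameshift
```

In this toy case the inserted base brings a stop codon (TGA) into frame, so the protein is truncated; with a different sequence the same kind of insertion could instead remove the original stop codon and extend the protein.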
Although most SNPs within the human genome are unlikely to produce functional effects directly, they can still be used as markers for genes of interest. This is because linkage disequilibrium extends over short distances [2] in the human genome, even in outbred populations; thus polymorphisms within the immediate vicinity of a given gene are likely to be non-randomly associated. Although many studies so far have used individual SNPs or other polymorphisms to assess functional end points (such as a clinical response in a phase 3 trial), the use of a nonfunctional polymorphism as a marker will give useful information only if that marker is in relatively tight linkage disequilibrium with the functionally relevant polymorphisms within the gene of interest. This could occur in two ways.
Firstly, a single mutation with a marked functional effect might have associated SNPs nearby, which will also show association with clinical end points because of linkage disequilibrium. In this situation the tightest association would be with the functionally relevant polymorphism, with association weakening as SNPs farther from the functionally important polymorphism are considered.
Secondly, multiple polymorphisms, each with a relatively small effect, might occur in combinations in which the combination has a particularly deleterious or beneficial associated phenotype. In this case haplotype analysis (i.e. looking at combinations of polymorphisms across the site) will give the most accurate information.
In principle, one would predict that the strength of linkage disequilibrium would fall as the distance between individual markers increases. However, this is not always true, presumably because of the different evolutionary time points at which polymorphisms have arisen and random differences in the rate of genetic drift, so that one can sometimes see tighter linkage disequilibrium between markers that are not adjacent than between adjacent markers (see, for example, [3]). In addition, recombination rates vary across genomic regions.
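To make the notion of non-random association concrete, the following Python sketch computes three commonly used measures of pairwise linkage disequilibrium (D, |D'| and r²) from phased haplotypes at two biallelic sites. The function name, the allele labels and the haplotype counts are all hypothetical and serve only to illustrate the calculation; they are not taken from any of the studies cited here.

```python
from collections import Counter


def linkage_disequilibrium(haplotypes, allele_a="A", allele_b="B"):
    """Return (D, |D'|, r^2) for two biallelic sites.

    `haplotypes` is a list of two-character strings, one per chromosome:
    first character = allele at site 1, second = allele at site 2.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    p_a = sum(c for h, c in counts.items() if h[0] == allele_a) / n
    p_b = sum(c for h, c in counts.items() if h[1] == allele_b) / n
    p_ab = counts[allele_a + allele_b] / n

    # D: observed haplotype frequency minus that expected under independence.
    d = p_ab - p_a * p_b

    # D is scaled by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical sample of 100 chromosomes in which A and B tend to travel together.
linked = ["AB"] * 40 + ["ab"] * 40 + ["Ab"] * 10 + ["aB"] * 10
d, d_prime, r2 = linkage_disequilibrium(linked)
print(round(d, 3), round(d_prime, 3), round(r2, 3))  # 0.15 0.6 0.36

# All four haplotypes equally common: no association between the sites.
unlinked = ["AB"] * 25 + ["Ab"] * 25 + ["aB"] * 25 + ["ab"] * 25
print(linkage_disequilibrium(unlinked))  # (0.0, 0.0, 0.0)
```

A non-functional marker is useful as a proxy only when measures such as r² between it and the functional variant are high; the sketch assumes that haplotype phase is already known, which in practice usually has to be inferred.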

Pharmacogenetics of airway treatment targets
Several primary targets for treatment of airway disease have been screened for polymorphic variation. The majority of data are from Caucasian populations and it is important to remember that differences in the prevalence of given polymorphisms can occur when populations with different ethnic backgrounds are studied. The main targets of currently available drugs which have been screened for polymorphic variation are shown in Table 1.
It is immediately clear that whereas some primary targets contain extensive polymorphic variation (such as the β2-adrenoceptor) [4,5], others show far less polymorphism (such as the muscarinic M3 receptor). Whereas for these less polymorphic genes there might be polymorphic variation in regulatory regions or in different population groups that have not yet been adequately studied, it seems that large differences in the amount of variability can exist in genes of similar sizes. The explanation for this is unclear but the variability is unlikely to be accounted for by evolutionary history (in other words, the time at which the receptor subtype or enzyme isoform arose). One possible explanation is that at least some of these variants have been driven by selection pressures (such as resistance to infection), although obviously this would not be related to treatment response in itself. There might also be selective constraints on given genes, resulting in lower or higher rates of variation occurring within them.
For airway disease, by far the best-studied primary target is the human β2-adrenoceptor. The gene is known to contain at least 17 SNPs within a 3-kilobase region including its regulatory regions and coding region [4-6]. Five of the nine polymorphisms in the coding region are degenerate but four result in amino acid substitutions within the protein [4]. Studies in which the different polymorphic variants of the receptor have been expressed in fibroblast lines have shown altered agonist binding (Thr164→Ile variant) [7] and altered downregulation profiles (Arg16→Gly; Gln27→Glu) [8]. Studies with cultured airway smooth muscle isolated from human lungs have shown similar data, at least for the codon 16 and 27 variants, although analysis is complicated by linkage disequilibrium effects with other polymorphisms within this locus in these constitutively expressing systems [9], and not all published data are consistent [10].
Many clinical studies have now been performed that examine the potential effects of these polymorphisms [11-25] and in general they have shown relatively small effects, although there are reasonably convincing data supporting reduced bronchodilator responses in individuals carrying the Gly16 allele [13,16,17,25]. However, recent studies have suggested that the haplotype across this region might in fact be the most important determinant of response [6]. If this proves to be correct, it would imply that the second of the models discussed above for multiple polymorphisms within a locus holds true for this gene. Whether or not treatment response can be adequately predicted prospectively by a knowledge of genotype and/or haplotype remains to be formally established.
The second gene for which reasonable data exist is the gene coding for 5-lipoxygenase. Insertions or deletions within the promoter region for this gene, which contains recognition sites for the transcription factor SP1, alter the level of transcription of the 5-lipoxygenase gene and hence the 5-lipoxygenase activity present within tissue [26-28]. In a study with a 5-lipoxygenase inhibitor, response to treatment was shown to be related to genotype; individuals having alleles associated with low transcriptional activity of the gene showed little or no response to the drug [27]. Preliminary data suggest that clinical response to Cys-leukotriene receptor antagonists might also be predicted by this polymorphism.
Data on the majority of other primary airway targets are less extensive and few clinical studies have been performed so far. Certainly for some targets it seems unlikely that clinical response is related to genetic variation; the muscarinic M3 receptor has not so far been found to contain any common coding-region polymorphisms [29] and the extent of polymorphic variation within both the histamine H1 receptor and the Cys-leukotriene 1 receptor is much lower than that of the β2-adrenoceptor [30]. In contrast, aspirin-sensitive asthma has been linked to a polymorphism in the leukotriene C4 synthase gene, and some supporting evidence exists at a clinical level [31].
One attractive target for pharmacogenetic studies is the glucocorticoid receptor. Perhaps surprisingly, given clear evidence of variable response to glucocorticoids (particularly in asthma), relatively little is known about genetic variability in the receptor and response to treatment. One non-degenerate polymorphism (Asn363→Ser) has been identified, but this is relatively rare; nevertheless, individuals with this polymorphism might be expected to show an enhanced response [32]. No mutations predicting glucocorticoid 'resistance' have yet been identified [33,34].
In addition to the primary target for drugs, downstream signalling pathways will also contain proteins that might show polymorphic variation. Far less is known about the potential contribution of these components to pharmacogenetic variability at present. However, it seems likely that the true profile of an individual in terms of response to a given agent is determined by a combination of polymorphic variation present at different points in the signal transduction cascade mediating the effect of that drug. Preliminary evidence that this is important can be seen from information on the interleukin-4 (IL-4) system. Polymorphic variation exists in the IL-4 gene itself, in the α subunit of the receptor (IL-4Rα) and in downstream signalling pathways (reviewed in [35]). Thus, the true phenotype of an individual in terms of his or her IL-4 responsiveness probably depends on a combination of genetic variables in all of these components of the signal transduction pathway.
Table 1 Selected genes in which polymorphic variation could contribute to variability in treatment response in asthma (adapted from [
Available online http://respiratory-research.com/content/3/1/10
One further important aspect of pharmacogenetics in general is the influence of polymorphism in drug-metabolising enzymes on pharmacokinetics (reviewed in [36]). For most airway drugs, cytochrome P450 polymorphism is relatively unimportant in clinical terms, although there are data to show that nicotine dependence is controlled in part by cytochrome P450 2A6 status [37].

Pharmacogenomics
Whereas pharmacogenetics deals with the influence of genetic variability on treatment response or the risk of serious adverse reactions to drugs, pharmacogenomics involves using molecular approaches to identify potential novel targets for drug design. Traditionally, drug discovery programmes have been based on the high-throughput screening of likely targets with the aim of identifying small-molecule antagonists or agonists at appropriate targets.
Obviously this approach requires a prior knowledge of the target. However, many of the 30,000 genes within the human genome code for novel proteins that could also be important targets for drug development. Without prior knowledge of the function of these gene products, classical pharmacological approaches are not feasible. Pharmacogenomic approaches are designed to identify which novel gene products might potentially be important.
The recent description of a draft sequence for the human genome will provide a further impetus to studies in this area [1].
Current approaches to pharmacogenomics depend on comparing expression profiles at the RNA (genomics) or protein (proteomics) level for a given tissue or cell type after a relevant stimulus. In principle this approach can be used to explore which genes are upregulated or downregulated in an inflamed airway by comparing the expression profiles in tissue taken from affected and unaffected individuals. The potential difficulty with this approach is that small variations in the cellular constituents of the tissue might produce large fluctuations in RNA and/or protein, giving rise to false positive (or negative) data.
Another problem is that the logistical difficulties of dealing with data on many gene products (which by definition have no known function) are considerable. These problems can be avoided to some extent by simplifying the experimental paradigm. For example, one approach that our group has recently adopted is to use cultured human airway smooth muscle cells from a single individual and then to compare expression profiles after treatment with pro-inflammatory and anti-inflammatory drugs.
A third approach is to attempt to combine classical genetic and pharmacogenomic methodologies. For example, one could examine the expression profile of novel genes in tissue from individuals with and without a respiratory disease (such as asthma) and then prioritise the novel gene products identified by concentrating on genes that map to regions of potential linkage from the genome screens that have been performed so far. This approach presupposes that drug targets are likely to be genes important in the initiation of the disease itself (otherwise they would not be identified in genome screen approaches).
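The prioritisation step described above amounts to a simple positional filter, which can be sketched in Python as follows. The gene names, chromosome labels and coordinates are entirely hypothetical placeholders, and the `Gene`, `LinkageRegion` and `prioritise` names are introduced here only for illustration.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    chromosome: str
    position: int  # base-pair coordinate (hypothetical)


class LinkageRegion(NamedTuple):
    chromosome: str
    start: int
    end: int


def prioritise(genes, regions):
    """Keep differentially expressed genes that map inside any linkage region."""
    return [
        g for g in genes
        if any(r.chromosome == g.chromosome and r.start <= g.position <= r.end
               for r in regions)
    ]


# Placeholder data: three novel transcripts found by expression profiling
# and two regions of potential linkage from a genome screen.
candidates = [
    Gene("novel_1", "chrA", 1_500_000),
    Gene("novel_2", "chrA", 9_000_000),
    Gene("novel_3", "chrB", 4_200_000),
]
linkage_regions = [
    LinkageRegion("chrA", 1_000_000, 2_000_000),
    LinkageRegion("chrB", 4_000_000, 5_000_000),
]

print([g.name for g in prioritise(candidates, linkage_regions)])
# ['novel_1', 'novel_3']
```

Genes falling outside every linkage region are not discarded as irrelevant by this filter; they are simply given lower priority for follow-up.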

RNA profiling
The concept of comparing expression profiles at the RNA level is not new, and differential-display approaches have been around for at least 10 years. The difficulty with the original approaches was, however, that it was time-consuming and problematic to identify potentially novel transcripts. The field has moved rapidly forwards with the development of arrays of sequence-verified clones relating to genes in the human genome that have been identified as a result of the human genome project [38]. These arrays can be made on membranes, on glass slides or on 'chips'. The approach here is to hybridise RNA extracted from the tissue or cell, with or without disease or treatment, on parallel arrays and then to compare their expression profiles. At present the availability of arrays is heavily dependent on the commercial sector, with many companies having in-house databases detailing the sequences relating to their arrays. It is to be hoped that, with time, this information will increasingly be held in the public domain. The capacity for profiling novel genes is extremely high, with micro-arrays or chips often holding several thousand clones. The unit cost of performing these kinds of experiment is also falling rapidly, with the result that the technology will be available to many more investigators in the academic sector.
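The comparison of parallel arrays described above can be sketched, in its simplest form, as a log2 fold-change calculation per clone. The Python example below is a minimal illustration under stated assumptions: the gene labels and intensity values are invented, the two-fold threshold is an arbitrary choice, and real analyses would also require normalisation and a formal statistical test across replicates.

```python
import math
from statistics import mean


def log2_fold_change(control, treated):
    """log2 ratio of mean treated signal to mean control signal."""
    return math.log2(mean(treated) / mean(control))


def call_changes(profiles, threshold=1.0):
    """Label each gene 'up', 'down' or 'unchanged'.

    `profiles` maps a gene label to (control replicates, treated replicates).
    `threshold` is in log2 units, so 1.0 corresponds to a two-fold change
    (an assumed cut-off, not one taken from the text).
    """
    calls = {}
    for gene, (control, treated) in profiles.items():
        lfc = log2_fold_change(control, treated)
        if lfc >= threshold:
            calls[gene] = "up"
        elif lfc <= -threshold:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


# Invented hybridisation intensities for three clones, three replicates each.
profiles = {
    "clone_A": ([100, 110, 90], [400, 420, 380]),
    "clone_B": ([200, 210, 190], [50, 55, 45]),
    "clone_C": ([150, 160, 140], [155, 150, 145]),
}

print(call_changes(profiles))
# {'clone_A': 'up', 'clone_B': 'down', 'clone_C': 'unchanged'}
```

A ratio of means of this kind cannot by itself distinguish a true change in expression from a change in the cellular composition of the tissue sampled, which is why the replicate structure and choice of experimental system matter.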

Protein profiling
Although a knowledge of RNA expression profiles is clearly important, a knowledge of change at the protein level, be it either in the amount of protein produced or in post-translational modifications, is a step closer to true function. This has led to the development of methods to assess protein expression profiles from cell or tissue lysates. Again, tissue or cells from diseased and unaffected individuals are used to prepare protein lysates, and the expression profiles are compared. Methods for identifying novel proteins are less advanced than for examining RNA expression profiles but rapid progress is nevertheless being made in this field. The standard method is to use two-dimensional gel electrophoresis to display proteins and then to select proteins whose abundance or mobility changes significantly. These proteins can then be cored from the gel and mass spectrometry used to obtain a signature that leads to identification of the protein in about one-third of cases. These approaches are technically quite difficult and time-consuming. Several companies are working on methods to create arrays of proteins analogous to the complementary DNA arrays used for RNA expression profiling. In theory it should be possible to generate protein arrays or chips by displaying monoclonal antibodies recognising a wide range of proteins; such approaches are currently under development.

Practical considerations
Although the pharmacogenomic approaches described here provide an obvious potential way of identifying novel genes important in a disease or in a treatment response, there are several practical difficulties that must be considered.
Firstly, it is critical to design the functional experiments carefully. For example, if a cell is to be treated with a given pro-inflammatory mediator and expression profiles are compared either at the RNA or protein level, a reasonable number of paired replicates must be performed and relevant time points examined. In practice it might be possible to reduce this to a baseline and two different time points for this kind of experiment; however, even then, with an appropriate number of replicates the number of samples to be processed remains considerable. It goes without saying that expression profile data generated from poorly designed experiments are likely to be at best worthless and at worst misleading.
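The way in which the sample count grows can be seen from a simple calculation, sketched below in Python. The design assumed here (one baseline sample per replicate, then every condition sampled at each later time point) and the numbers of replicates and conditions are illustrative assumptions, not recommendations from the text.

```python
def samples_required(replicates, later_time_points, conditions):
    """Count samples for a paired design.

    Assumes each replicate contributes one baseline sample, plus one sample
    per condition at every later time point.
    """
    return replicates * (1 + later_time_points * conditions)


# Baseline plus two later time points, mediator-treated versus untreated cells,
# four paired replicates (all numbers hypothetical).
n = samples_required(replicates=4, later_time_points=2, conditions=2)
print(n)      # 20
print(2 * n)  # 40 if both RNA and protein profiles are generated
```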
Secondly, a decision must be made on what to do with the novel targets identified. Initially, verification is needed and this is probably best done by using the reverse-transcriptase-mediated polymerase chain reaction in a quantitative manner.
Thirdly, the real challenge, having verified a target, is to move from knowledge of a novel gene product to knowledge of its function. As discussed above, some method of prioritising targets to be studied further is critically important at this stage. At present the use of these techniques to study respiratory disease is in its relative infancy, although in other disease areas (such as oncology) novel gene products are being identified that are likely to be important in disease pathophysiology.

Conclusion
This review has summarised how genetic approaches can be used to identify novel drug targets and, potentially, to optimise treatment response. Over the next 10 years it will become clear whether these approaches are likely to be cost effective either in the development of new drugs or in optimising the prescribing of drugs for individual patients with given diseases.