More than 350 cancer genes have been identified to date (see www.sanger.ac.uk/genetics/CGP) using a combination of epidemiological, genetic, and laboratory techniques. Decades of research into cancer genes have established some basic principles of tumorigenesis, including the roles of tumor suppressor genes and oncogenes, the loss and gain of which, whether through copy number changes or through inactivating or activating point mutations, can initiate or promote the evolution of cancer. Building on these principles, technologies that permit rapid sequencing or simultaneous analysis of copy number changes at hundreds of thousands of loci, coupled with advanced statistical analysis, are accelerating the identification of new candidate cancer genes in a large spectrum of human tumors.
Many cancer genes have been identified through the use of positional cloning to characterize a recurrent chromosomal anomaly such as a translocation (t) or a chromosome gain or loss.1 Other responsible genes, principally those associated with inherited cancer syndromes, have been uncovered through a combination of epidemiological studies and linkage analysis. Finally, a subset of cancer genes has been identified in the laboratory based on homology to other known genes and/or biological characteristics. High-throughput sequencing2 and microarray3 platforms that exploit the genetic changes (such as mutations and copy number loss or gain) involved in tumorigenesis promise to accelerate advances in this field.
Chromosome Translocations and Tumorigenesis
Historically, cytogenetic observations have enabled cancer gene discovery. In 1956, the human karyotype was defined as containing 46 chromosomes, which were classified according to size and centromeric position.4 Several years later, Peter Nowell and David Hungerford identified an abnormal chromosome in patients with chronic myeloid leukemia (CML),5 which became known as the Philadelphia chromosome. With advances in chromosomal banding techniques, the Philadelphia chromosome was identified as a chromosome 22 with an apparent deletion on the long arm, with chromosome 9 as its translocation partner. Genetic-mapping techniques identified the breakpoint cluster region (BCR) and Abelson murine leukemia virus homolog (ABL) as the fusion partners in the t(9;22) product.6 The chimeric bcr-abl protein drives abl kinase activity, with the functional consequence of increased myeloid proliferation. In this proliferative state, additional pathogenic mutations develop, eventually leading to transformation into an acute leukemia. In the absence of treatment, the disease inevitably progresses. However, the development of tyrosine kinase inhibitors, such as imatinib mesylate, has led to remarkable improvements in survival.7
Currently, CML is defined by the presence of the t(9;22), and the translocation has been shown to be both necessary and sufficient for disease.7 Similar translocation events drive numerous hematological disorders (acute leukemias and lymphomas), sarcomas,8 and a few carcinomas.9 Furthermore, it is increasingly recognized that the translocation defines the neoplasm, predicts the biological behavior and therapeutic response, and, as in the case of CML, even directs the type of therapy.10
In most cases, a cytogenetically visible chromosomal abnormality provides the first clue regarding a tumor’s underlying genetic abnormality. However, cancer gene discovery by this method is limited by the resolution of chromosomal banding techniques, as only large structural rearrangements or large gains and losses of chromosomal material will be detected. Cryptic translocations (those involving the telomeres, or those between chromosomal segments with similar banding patterns) and submegabase changes require alternative detection methods, including comparative genomic hybridization (CGH) and fluorescence in situ hybridization (FISH).4
Inherited Cancer Syndromes
Many of the most common cancers, including breast and colon carcinomas, are clinically and pathologically heterogeneous and appear to be driven by a complex set of environmental and genetic interactions. Decades of clinical observation and epidemiological studies have identified a number of inherited cancer syndromes, most famously breast cancer (BRCA)-related breast and ovarian cancer syndromes, hereditary non-polyposis colorectal carcinoma (HNPCC), and familial adenomatous polyposis (FAP). While these syndromes represent only a small fraction of cancer cases in the population, recognition of the causative mutations has provided significant and general insight into mechanisms of tumorigenesis and DNA repair.
Documentation of the hereditary nature of HNPCC came in the 19th century, when Dr Aldred Warthin described a family with multiple gastrointestinal cancers. Over 100 years later, epidemiological studies confirmed that HNPCC was inherited in an autosomal dominant manner.11 Genetic studies using microsatellite markers showed tight linkage of the disease with markers on the short arm of chromosome 2 (2p).12 Microsatellites are short, usually noncoding DNA sequences containing tandem repeats of two to four nucleotides; because they are variable within the population, they can serve as markers of inheritance. Researchers then looked for allelic loss on 2p, that is, loss of heterozygosity (LOH), as an explanation for tumorigenicity. While LOH was not seen, an interesting observation was made: the tumor cells contained new microsatellite alleles at multiple loci, which suggested impaired repair of repeated sequences. Subsequently, this process was termed microsatellite instability.
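The comparison behind that observation can be sketched in a few lines of Python. This is a minimal illustration, not the method used in the cited studies: it assumes each marker is a simple tandem repeat whose allele is described by its repeat count, and the function names and toy sequences are hypothetical.

```python
import re

def repeat_lengths(sequence, unit):
    """Return the repeat count of every tandem run of `unit` (2 or more copies)."""
    pattern = re.compile(r"(?:%s){2,}" % re.escape(unit))
    return [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence)]

def allele(sequence, unit):
    """Describe a read by its longest tandem run of `unit` (0 if there is none)."""
    return max(repeat_lengths(sequence, unit), default=0)

def novel_alleles(normal_alleles, tumor_alleles):
    """Repeat counts present in tumor DNA but absent from matched normal DNA."""
    return sorted(set(tumor_alleles) - set(normal_alleles))

# Toy reads for one hypothetical CA-repeat marker (not real data).
normal_reads = ["GGT" + "CA" * 12 + "TTG", "GGT" + "CA" * 15 + "TTG"]
tumor_reads = normal_reads + ["GGT" + "CA" * 9 + "TTG"]

normal = [allele(read, "CA") for read in normal_reads]
tumor = [allele(read, "CA") for read in tumor_reads]
print(novel_alleles(normal, tumor))  # [9]: a repeat length not seen in normal DNA
```

A marker is flagged as unstable here when the tumor carries any repeat length that the matched normal tissue lacks; a real assay would also need to account for PCR stutter and for how many markers must shift before a tumor is called unstable.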
Concurrent investigations in yeast found that mutations in mismatch repair genes lead to microsatellite instability. These findings prompted a search for human homologs of the yeast mismatch repair genes, one of which was found on 2p and is now known as MSH2. It is one of a handful of mismatch repair genes that can be mutated in HNPCC.13 The familial nature of FAP was also recognized for decades before the first clues to the genetic basis of the disease were identified. In 1986, an interstitial deletion on the long arm of chromosome 5 (5q) was seen in a patient with FAP. Fine mapping at this locus ultimately led to the identification of the adenomatous polyposis coli (APC) gene. APC is now recognized as a pivotal tumor suppressor gene that is mutated early in the pathogenesis of most sporadic colorectal carcinomas.14
Identification of Cancer Genes In Vitro
Ras-Raf-MEK-ERK is one of the most commonly mutated pathways in human carcinomas. The first clues to this pathway’s role in tumorigenesis came from in vitro biological assays in the early 1980s, when a genomic fragment from a lung carcinoma cell line, with homology to the Kirsten rat sarcoma virus, was shown to transform fibroblasts.15 Activating mutations in the KRAS gene have been found in up to 25% of human tumors and are instrumental in both tumor initiation and progression.16 In addition, other members of the Ras-Raf-MEK-ERK pathway, including NRAS, HRAS, and BRAF, have been implicated in tumorigenesis via their role in mediating proliferative and survival responses to growth factors. Germline mutations in members of this signaling pathway are associated with several congenital syndromes, including Noonan syndrome (PTPN11, KRAS) and cardio-facio-cutaneous syndrome (KRAS, BRAF, MEK). Studies of these syndromes are shedding light on the developmental function of these genes, with implications for our understanding of how disordered signaling through this pathway influences tumorigenesis.17
Oncogene Copy Number Gain in Tumorigenesis
Amplification of receptor tyrosine kinase genes is another known mechanism contributing to tumorigenesis. Human epidermal growth factor receptor 2 (HER2)/neu is the best described amplified cancer gene and the one most widely exploited clinically. Originally cloned in the mid-1980s from tumor cell lines, the gene was named for its sequence homology to the human epidermal growth factor receptor.18 Subsequently, it has been shown to be amplified in a subset of gastric, lung, and salivary carcinomas, but it is most commonly amplified in breast carcinoma. HER2/neu amplification is present in up to 30% of breast cancer patients and is associated with more aggressive disease. Trastuzumab, a recombinant monoclonal antibody to the HER2 protein product, has been shown to improve outcomes in patients with HER2/neu amplification, which can be detected via immunodetection of the overexpressed protein product or via in situ hybridization for the amplified gene.19
This model has been extended to other tumors as well. Mutations in the kinase domain and/or amplifications of the epidermal growth factor receptor (EGFR) in non-small-cell lung cancer predict response to tyrosine kinase inhibitors (gefitinib, erlotinib), which have proven efficacy in a subgroup of patients, typically non-smoking women of Asian descent.20 Histopathological classification is of somewhat limited utility in identifying potential responders; rather, sequencing of the EGFR kinase domain and in situ hybridization, via either a fluorescence or chromogenic approach, promise to serve as important ancillary studies in lung cancer diagnosis.21,22
High-throughput Approaches to Cancer Gene Discovery
Currently, systematic analysis of cancer genomes by high-throughput methods is driving the discovery of important actors in tumorigenesis. Typical approaches include comparative genomic hybridization with single nucleotide polymorphism (SNP) or bacterial artificial chromosome (BAC) microarrays. These methods permit detection of deletions or duplications (‘copy number changes’) in tumors compared with normal DNA at hundreds of thousands of loci. For instance, a recent study of nearly 250 lung adenocarcinoma samples examined copy number changes at over 238,000 SNP loci. Statistical analysis revealed over 50 recurrent copy number changes, both large-scale (gain or loss of over half of a chromosome arm) and focal (several hundred kilobases to several megabases). Further analysis of these sites confirmed previous observations of loss of tumor suppressor genes such as cyclin-dependent kinase inhibitors (CDKN2A/CDKN2B), and gain of proto-oncogenes (KRAS, EGFR, and ERBB2) in lung adenocarcinoma. More importantly, this large-scale analysis identified new regions of frequent genomic loss and gain, with one promising, recurrent amplification occurring at 14q13.3, a locus that contains a gene encoding a transcription factor critical to type II pneumocyte formation.22
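The idea of a recurrent copy number change can be made concrete with a small Python sketch. It assumes that each locus yields a tumor signal and a matched normal signal, calls gain or loss from the log2 ratio, and reports loci altered in the same direction in a minimum fraction of samples. The thresholds, locus names, and signal values are hypothetical, and the simple frequency cutoff is a stand-in for the statistical analysis used in the study, which is not described here.

```python
import math
from collections import Counter

GAIN_THRESHOLD = 0.3    # hypothetical log2-ratio cutoff for a gain
LOSS_THRESHOLD = -0.3   # hypothetical log2-ratio cutoff for a loss

def log2_ratio(tumor_signal, normal_signal):
    """Log2 of tumor over normal signal; 0 means no copy number change."""
    return math.log2(tumor_signal / normal_signal)

def call_locus(ratio):
    """Classify one locus as 'gain', 'loss', or 'neutral'."""
    if ratio >= GAIN_THRESHOLD:
        return "gain"
    if ratio <= LOSS_THRESHOLD:
        return "loss"
    return "neutral"

def recurrent_changes(samples, min_fraction=0.5):
    """Return sorted (locus, call, n_samples) for changes seen in enough samples."""
    counts = Counter()
    for sample in samples:
        for locus, (tumor, normal) in sample.items():
            call = call_locus(log2_ratio(tumor, normal))
            if call != "neutral":
                counts[(locus, call)] += 1
    needed = min_fraction * len(samples)
    return sorted((locus, call, n) for (locus, call), n in counts.items() if n >= needed)

# Toy data: four tumors, three loci, values are (tumor signal, normal signal).
samples = [
    {"snp_1": (200, 100), "snp_2": (100, 100), "snp_3": (50, 100)},
    {"snp_1": (150, 100), "snp_2": (105, 100), "snp_3": (100, 100)},
    {"snp_1": (160, 100), "snp_2": (95, 100), "snp_3": (55, 100)},
    {"snp_1": (100, 100), "snp_2": (100, 100), "snp_3": (60, 100)},
]
print(recurrent_changes(samples))
# [('snp_1', 'gain', 3), ('snp_3', 'loss', 3)]
```

Counting by direction keeps a locus that is gained in some tumors and lost in others from being reported as a single recurrent event.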
Systematic sequencing of the ‘cancer genome’ can identify recurrent mutations from which new cancer genes can be inferred. While cytogenetic and comparative genomic analysis can effectively identify changes on the scale of hundreds of kilobases or larger, pathogenic point mutations cannot be detected by these methods. High-throughput sequencing is starting to elucidate the mutational signature of various human cancers. In a study that sequenced over 500 protein kinase genes encompassing 270 megabases of DNA from 210 different human cancers, numerous recurrent mutations were identified. Mutations that lead to a change in the amino acid sequence and, in particular, those that are located in the kinase domain are predicted to be the most likely to lead to gene activation and thus drive abnormal protein kinase activity. Using mathematical algorithms to predict pathogenic mutations, the authors were able to detect well-established cancer-related protein kinases, as well as new candidate cancer genes that will require biological validation but may be promising new targets for protein kinase inhibitors in cancer therapy.23
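The prioritization logic in this paragraph (amino acid changing, located in the kinase domain, recurrent across tumors) can be sketched as a simple filter in Python. This is an illustration only and not the authors' algorithm, which is not specified here; the gene names, domain coordinates, and mutation records are all hypothetical.

```python
# Hypothetical kinase-domain boundaries (amino acid positions), for illustration only.
KINASE_DOMAINS = {"KINASE_A": (100, 350), "KINASE_B": (50, 300)}

def is_nonsynonymous(ref_aa, alt_aa):
    """True if the mutation changes the encoded amino acid."""
    return ref_aa != alt_aa

def in_kinase_domain(gene, position):
    """True if the position lies inside the gene's annotated kinase domain."""
    bounds = KINASE_DOMAINS.get(gene)
    if bounds is None:
        return False
    start, end = bounds
    return start <= position <= end

def recurrent_candidates(mutations, min_tumors=2):
    """Return sorted (gene, position, n_tumors) for sites that pass both filters
    and are mutated in at least `min_tumors` distinct tumors."""
    tumors_by_site = {}
    for tumor_id, gene, position, ref_aa, alt_aa in mutations:
        if is_nonsynonymous(ref_aa, alt_aa) and in_kinase_domain(gene, position):
            tumors_by_site.setdefault((gene, position), set()).add(tumor_id)
    return sorted(
        (gene, position, len(tumors))
        for (gene, position), tumors in tumors_by_site.items()
        if len(tumors) >= min_tumors
    )

# Toy records: (tumor id, gene, amino acid position, reference aa, mutant aa).
mutations = [
    ("T1", "KINASE_A", 200, "V", "E"),
    ("T2", "KINASE_A", 200, "V", "E"),
    ("T3", "KINASE_A", 200, "V", "V"),  # synonymous, filtered out
    ("T1", "KINASE_B", 400, "G", "R"),  # outside the kinase domain
    ("T2", "KINASE_B", 120, "L", "P"),  # seen in only one tumor
]
print(recurrent_candidates(mutations))  # [('KINASE_A', 200, 2)]
```

Tumors are collected in a set per site so that a mutation reported twice for the same tumor is counted once. Candidates that survive such a filter would still need the biological validation the text describes.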
Since its inception in the mid-20th century, the field of cancer genetics has identified numerous oncogenes and tumor suppressor genes that have been implicated in tumorigenesis in many different body systems. Traditionally, cancer gene identification has been laborious, frequently requiring decades of collaboration between clinicians, pathologists, geneticists, epidemiologists, and basic scientists. With improvements in cytogenetic, molecular, and computational technologies, investigators in the field today are poised to identify candidate cancer genes more quickly and to translate these findings into promising new therapeutic targets.