Gene Set-based Approaches for Analyzing Copy Number Alterations and Drug Combinations in Cancer
Date Issued
2015
Date
2015
Author(s)
Hsu, Yu-Ching
Abstract
With advances in high-throughput microarray and next-generation sequencing technologies, various statistical methods and mathematical models have been developed to comprehensively explore complex cancer genomes. Recently, knowledge-based gene set analysis was proposed and has yielded remarkable findings from different layers of molecular data, such as gene expression and genomic alterations. Because it can detect functional changes resulting from both significantly and modestly changed genes, gene set analysis provides biological insights into cancer genomes. In this study, we proposed two systematic analysis methods based on the concept of gene set analysis: one for analyzing copy number alterations (CNAs) and one for predicting combinatorial drug therapy. CNAs, defined as genomic alterations spanning more than a thousand base pairs, affect a large number of genes simultaneously and play an essential role in tumorigenesis. In the first part of this study, we sought to systematically explore their influence on biological functions and their association with patient survival. We devised an algorithm, called Gene Set analysis for Copy number Alterations (GSCA), and analyzed CNA (N = 1,045) and gene expression (N = 529) datasets of breast tumors downloaded from The Cancer Genome Atlas (TCGA). Clinical information for these samples was also incorporated and analyzed together with the identified CNA-affected gene sets. Thirty-five and ten gene sets showed significant enrichment in profiles of copy number gains and losses, respectively. Genes within 44 of the 45 gene sets (98%) exhibited expression changes concordant with their copy number status. In addition, survival analysis revealed a prognostic role for several CNA-affected gene sets. Taken together, the results showed that CNAs can disturb biological functions by altering gene expression, and thus affect the clinical outcomes of patients.
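As a rough illustration of what testing a gene set for enrichment among CNA-affected genes can look like, the sketch below uses a generic hypergeometric over-representation test. This is not the GSCA algorithm, whose details are not given in this abstract; the function names `hypergeom_sf` and `enrichment_p` and the toy gene identifiers are assumptions made for the example.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for X ~ Hypergeometric(N population, K marked, n drawn)."""
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible overlaps contribute nothing to the sum.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enrichment_p(gene_set, altered_genes, background):
    """One-sided p-value that gene_set is over-represented among altered_genes.

    Hypothetical helper: a plain over-representation test, not GSCA itself.
    """
    background = set(background)
    gene_set = set(gene_set) & background
    altered = set(altered_genes) & background
    overlap = len(gene_set & altered)
    return hypergeom_sf(overlap, len(background), len(gene_set), len(altered))


# Toy data (made up): 20 background genes, a 5-gene set, 5 genes with copy number gain.
background = [f"g{i}" for i in range(20)]
gene_set = ["g0", "g1", "g2", "g3", "g4"]
gained = ["g0", "g1", "g2", "g3", "g10"]
p_value = enrichment_p(gene_set, gained, background)
```

In practice such a test would be run once per gene set, separately for gains and for losses, followed by a multiple-testing correction.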
Because of the complexity of the cancer genome, patients also suffer from relapse caused by resistance to individual antitumor drugs. Combinatorial drug therapy is greatly needed, since single drugs alone cannot overcome resistance resulting from the continuous activation of a drug target or its downstream signaling pathway. However, given the large number of FDA-approved drugs, it is impractical to experimentally test every possible drug pair. To address this issue, we proposed a computational method for predicting drug synergy. We hypothesized that drug pairs achieve synergy by targeting similar biological functions and similar genes within a function, and we validated the devised methods on datasets provided by the DREAM consortium. The results showed that the devised prediction scores performed well; the co-gene/GS score even outperformed the methods proposed during the DREAM challenge. The results also showed that the best-performing method, which is built on the concept of gene set analysis, can be used to investigate the underlying mechanism by which drug pairs achieve synergy. We further applied the methods to a larger dataset, the Connectivity Map dataset, to explore a broader range of synergistic drug combinations. Overall, in the present study we proposed two gene set-based approaches to systematically study the biological roles and clinical significance of CNAs and to predict drug synergy in breast cancer. We demonstrated that these methods can not only recover biologically well-established results, but also reveal abundant novel candidates for future biological investigation. The findings are expected to enhance our understanding of tumorigenesis and facilitate the development of combinatorial drug therapy for cancers.
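The hypothesis that synergistic drugs target similar functions and similar genes within a function can be illustrated with a simple overlap measure. The sketch below is not the co-gene/GS score, which is not defined in this abstract; the Jaccard index, the function `pair_similarity`, and the example drug profiles are all assumptions chosen for illustration.

```python
def jaccard(a, b):
    """Jaccard index of two collections, treated as sets (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_similarity(drug_a, drug_b):
    """Return (function_score, gene_score) for two drug profiles.

    Each profile is a dict mapping a gene set name to the set of genes in
    that gene set affected by the drug. Hypothetical scoring scheme:
      function_score: overlap of the affected gene sets
      gene_score:     mean gene-level overlap within the shared gene sets
    """
    shared = set(drug_a) & set(drug_b)
    function_score = jaccard(drug_a, drug_b)  # iterating a dict yields its keys
    if shared:
        gene_score = sum(jaccard(drug_a[s], drug_b[s]) for s in shared) / len(shared)
    else:
        gene_score = 0.0
    return function_score, gene_score


# Made-up profiles for two hypothetical drugs.
drug_x = {"apoptosis": {"TP53", "BAX"}, "cell_cycle": {"CDK1"}}
drug_y = {"apoptosis": {"TP53", "BCL2"}, "dna_repair": {"BRCA1"}}
function_score, gene_score = pair_similarity(drug_x, drug_y)
```

Keeping the two scores separate makes it possible to see whether a candidate pair is similar at the level of functions, of genes within those functions, or both.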
Subjects
gene set analysis
copy number alteration
drug combination
breast cancer
microarray
Type
thesis
File(s)
Name
ntu-104-R02945036-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):5d888a3a406d9ca2a1bb9fc21ccc51f5
