Predicting essential genes based on network and sequence analysis
Resource
2007 symposium of bioinformatics and systems biology in Taiwan, Chiayi, Taiwan, September 4-5
Journal
Molecular BioSystems
Journal Volume
5
Journal Issue
12
Pages
1672–1678
Date Issued
2007
Date
2007
Author(s)
Abstract
Essential genes are indispensable to the viability of an organism. Identifying and analyzing essential genes is key to understanding the systems-level organization of living cells. In addition, the ability to predict these genes in pathogens is of great importance for directed drug development. Global analysis of protein interaction networks provides an effective way to elucidate the relationships between genes. It has been found that essential genes tend to be highly connected and generally have more interactions than nonessential ones. Using recent large-scale identifications of essential genes and protein-protein interactions in Saccharomyces cerevisiae and Escherichia coli, we systematically investigated the topological properties of essential and nonessential genes in the protein-protein interaction networks. Essential genes tend to play topologically more important roles in these networks, and many topological features were found to be statistically discriminative between essential and nonessential genes. We also examined sequence properties such as open reading frame length, strand, and phyletic retention for their association with gene essentiality. Using the topological features of the protein interaction network together with the sequence properties, we built a machine learning classifier capable of predicting essential genes. Computational prediction of essential genes circumvents expensive and difficult experimental screens and will help antimicrobial drug development. © 2009 The Royal Society of Chemistry.
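The abstract does not say which topological features or which classifier were used, so the following is only a minimal sketch of the general idea: compute a few common node-level network measures (degree, clustering coefficient, closeness) and compare them between essential and nonessential genes. The toy network, the gene names, the essentiality labels, and all function names are hypothetical and are not taken from the study.

```python
from collections import deque
from statistics import mean

# Hypothetical toy interaction network (undirected edges); not data from the study.
EDGES = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"),
    ("D", "E"), ("E", "F"),
]
# Hypothetical essentiality labels, for illustration only.
ESSENTIAL = {"A", "D"}


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj, node):
    """Fraction of neighbour pairs of `node` that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


def closeness(adj, node):
    """Closeness centrality: reachable nodes divided by summed BFS distances."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def topological_features(edges):
    """Return a per-node feature dictionary for the given edge list."""
    adj = build_adjacency(edges)
    return {
        node: {
            "degree": len(adj[node]),
            "clustering": clustering_coefficient(adj, node),
            "closeness": closeness(adj, node),
        }
        for node in adj
    }


features = topological_features(EDGES)
# Compare mean degree between the two (hypothetical) gene classes.
mean_degree_essential = mean(
    f["degree"] for g, f in features.items() if g in ESSENTIAL
)
mean_degree_nonessential = mean(
    f["degree"] for g, f in features.items() if g not in ESSENTIAL
)
```

In a real analysis, per-gene feature vectors like these, combined with sequence properties such as open reading frame length, would be passed to a classifier of the analyst's choice and evaluated against experimentally determined essentiality labels.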
Other Subjects
article; bacterial genome; biological model; DNA sequence; Escherichia coli; essential gene; fungal genome; gene regulatory network; genetics; genomics; methodology; receiver operating characteristic; Saccharomyces cerevisiae; statistical model; Gene Regulatory Networks; Genes, Essential; Genome, Bacterial; Genome, Fungal; Genomics; Models, Genetic; Models, Statistical; ROC Curve; Sequence Analysis, DNA
Publisher
The Royal Society of Chemistry
Type
journal article
