Alignment Algorithm for Comprehensive Two-dimensional Gas Chromatography-Mass Spectrometry
Date Issued
2011
Date
2011
Author(s)
Tian, Tze-Feng
Abstract
This thesis comprises three works: 1) an alignment algorithm for comprehensive two-dimensional gas chromatography-mass spectrometry, 2) 3Omics, a web-based systems biology visualization tool for integrating human transcriptomic, proteomic, and metabolomic data, and 3) HMO, a tool for understanding the human metabolome.
A novel peak alignment algorithm, 2DGCMS-aligner, has been developed for two-dimensional gas chromatography time-of-flight mass spectrometry (GCxGC/TOF-MS) data. 2DGCMS-aligner takes the netCDF data generated by the instrument directly as input. It detects blobs (clusters of pixels that are brighter or darker than their surroundings in a chromatogram) in each GCxGC/TOF-MS raw data set and generates blob tables, rather than peak tables, for alignment. 2DGCMS-aligner matches blobs using the Euclidean distance between their first- and second-dimension retention times in the blob tables and Pearson's correlation coefficient between their mass spectra. The alignment algorithm can be applied to GCxGC-MS data acquired under either consistent or inconsistent instrument conditions, correcting retention time shifts along both chromatographic dimensions caused by uncontrollable fluctuations in temperature and pressure, matrix effects, and stationary phase degradation. 2DGCMS-aligner also includes an option to correct the baseline directly on the raw data. The performance of the 2DGCMS-aligner peak alignment algorithm was compared with that of three existing alignment methods using two GCxGC-MS data sets acquired under different experimental conditions and a mixture of standard metabolites.
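The blob-matching idea described above can be illustrated with a minimal Python sketch. The abstract does not state how the two similarity measures are combined, so everything here is an assumption made for illustration: the `Blob` record, the `max_dist` and `min_corr` thresholds, and the greedy one-to-one pairing are hypothetical and are not taken from 2DGCMS-aligner itself. The sketch also assumes that all spectra are sampled on a shared m/z grid.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Blob:
    """Hypothetical blob-table row (field names are assumptions)."""
    rt1: float                  # first-dimension retention time
    rt2: float                  # second-dimension retention time
    spectrum: Sequence[float]   # intensities on a shared m/z grid


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rt_distance(a: Blob, b: Blob) -> float:
    """Euclidean distance between two blobs in the retention-time plane."""
    return math.hypot(a.rt1 - b.rt1, a.rt2 - b.rt2)


def match_blobs(reference: List[Blob], sample: List[Blob],
                max_dist: float, min_corr: float) -> List[Tuple[int, int]]:
    """Greedy one-to-one matching (an assumed strategy, for illustration).

    A pair is a candidate when the blobs lie within max_dist of each other
    and their spectra correlate at min_corr or above. Candidates are then
    accepted closest-first, using each blob at most once.
    """
    candidates = []
    for i, r in enumerate(reference):
        for j, s in enumerate(sample):
            d = rt_distance(r, s)
            if d > max_dist:
                continue
            c = pearson(r.spectrum, s.spectrum)
            if c >= min_corr:
                candidates.append((d, -c, i, j))
    candidates.sort()  # smallest distance first, ties broken by higher correlation
    used_ref, used_sample, pairs = set(), set(), []
    for _, _, i, j in candidates:
        if i in used_ref or j in used_sample:
            continue
        used_ref.add(i)
        used_sample.add(j)
        pairs.append((i, j))
    return sorted(pairs)


reference = [Blob(10.0, 1.0, [1, 5, 2, 0]), Blob(20.0, 2.0, [0, 1, 9, 3])]
sample = [Blob(20.4, 2.1, [0, 1, 8, 3]),
          Blob(10.3, 1.1, [1, 4, 2, 0]),
          Blob(10.2, 1.0, [9, 0, 0, 7])]   # nearby, but a dissimilar spectrum
pairs = match_blobs(reference, sample, max_dist=1.0, min_corr=0.9)
```

In this toy example the third sample blob is the closest to the first reference blob in retention time, yet it is rejected because its mass spectrum does not correlate. This shows why combining both measures can be more robust than using retention time alone. In practice the two retention-time axes have very different scales, so some normalization of each axis would probably be needed before computing the distance.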
3Omics, a web-based systems biology visualization tool, was developed to visualize and rapidly integrate multiple inter- or intra-transcriptomic, proteomic, and metabolomic human data sets. A biochemical cascade is generated by consolidating transcript, protein, and metabolite data, and is examined through five commonly used analyses: correlation network, co-expression, phenotyping, KEGG pathway enrichment, and GO enrichment. 3Omics incorporates the advantages and operations of existing software into a single platform, thereby simplifying the data analysis procedure and enabling users to perform a one-click integrated analysis free of charge. Visualization and analysis results can be downloaded for further customization and analysis. The 3Omics software is freely accessible at http://cmdd.csie.ntu.edu.tw/~3omics.
The last part of this thesis is the construction of the Human Metabolome Ontology (HMO). The final step in current metabolomics studies is the assessment and biological interpretation of the metabolome. This often requires tedious manual collection of literature or linking of information scattered across Gene Ontology, BRENDA, KEGG Brite, KEGG Pathway, the Human Metabolome Database, OMIM, and other resources. We developed the HMO to facilitate the integration of biological functions and chemical classification of the metabolome, and to support a comprehensive understanding of the metabolome and its target interactions, serving as a common language and knowledge framework for further computational analysis. HMO consists of three independent ontologies: biological functions, chemical taxonomies, and metabolome targets. It provides a comprehensive, metabolome-centered resource that enables the sharing and reuse of knowledge across ontology domains.
Subjects
2DGC alignment
Systems Biology Visualization
Human Metabolome Ontology
Metabolomics
Type
thesis
File(s)
Name
ntu-100-R98922152-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):de7504ac184601d0bc8f36fd0c9b8ca4