Statistical Methods for Microarray Data
Date Issued
2002
Date
2002
Author(s)
張啟仁
DOI
902321B002001
Abstract
Genomic medicine research in Taiwan has just begun. Modern laboratory tools, advances in computer technology, and the ubiquity of the internet offer scientists an unprecedented opportunity to access, share, and analyze critical data and information stored in databases across cyberspace. Scientific discovery can be expedited, and many wasteful and costly experiments avoided, if this vast body of information can be stored, shared, analyzed, and opened to research scientists in any clinical research institute. One successful example of collaboration among physicians, laboratory scientists, and biostatisticians has been established within the National Taiwan University Hospital Research Group, whose members have pursued their own research topics in collaboration with scientists in the Microarray laboratory and the Bioinformatics and Biostatistics laboratory.
Recent developments in Microarray technology have brought research scientists into a new era in their own fields. By collaborating with the scientists who generate Microarray data, investigators can examine research problems from a much broader perspective, i.e. from the large volume of data generated by the array machine. However, because the technique is developing quickly and the data output is enormous, data analysis in this new area of Bioinformatics has become an important issue in biomedical research.
Understanding what Bioinformatics can provide, and how, has given clinical physicians in the medical center a starting point for reconsidering a Bioinformatics facility as a research service resource. A Microarray facility center has been established under the direction of Dr. Jeremy Chen, through which research services and education in the use of the Microarray machine and technique were rapidly provided within NTUH. In the meantime, statistical data analysis support for Microarray data has also been provided on the NTUH research campus. An NSC-supported grant, “Bioinformatics Research Services Facility Using Microarray Data (NSC89-2316-B-002-035)”, gave us a good starting point for setting up the connection among Microarray data analysts, Microarray data generators, and research investigators. In addition to the previous year's support from NSC, another NSC research project, “Statistical Mining Methods for Information Generated from Microarray (NSC90-2321-B002-001)”, which deals with statistical methods, has been awarded and emphasizes finding a suitable data filtering method.
Huge data sets can easily be generated by the high-speed machine, so data collection, cleaning, selection, and management in the database are strongly needed. However, such problems can currently be tackled only with the limited programs attached to array management software such as Genecluster from MIT or Spotfire. Even with this good starting point for statistical support of the data generated from Microarray research, there still exist potential problems in handling the data, from the early data management stage to the later statistical modeling and analysis stage. Some issues of interest in dealing with Microarray data are 1) data filtering; 2) cluster analysis; and 3) discriminant classification analysis. Our goals are to study and develop practical and advanced statistical data mining methods for Microarray data, especially when the data generated involve a great deal of uncertainty, so that data filtering becomes the major issue in this research proposal.
The validity of these huge data sets has so far been assessed mainly through automated software run by individual PIs and their assistants. From this experience we have found that the data generated by Microarray machines carry their own uncertainty. How to handle this error and how to correct the data have become important issues and have been discussed by many experts, such as Lee et al. and DeRisi et al.
In a study to identify possible clusters of genes, one needs to eliminate the falsely expressed genes. This can be done by using both cell populations, such as normal and abnormal, in the experiment, through a two-stage procedure. First, normal vs. normal expression is used to detect which genes are sensitive to the noise or bias of the experiment and thus to identify the “false expressed” genes. Second, after eliminating these “false expressed” genes, one can plot normal vs. tumor expression to identify the influential genes. We propose using ACE (Alternating Conditional Expectation transformation) to tackle the abnormality of the data generated from Microarray. Some discussion of this method is presented in this report. This report follows the guidance for report writing under the grant supported by the National Science Council.
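The two-stage filtering idea described in the abstract can be illustrated with a minimal sketch. It is not the report's actual implementation: the function names, the log2-ratio measure, the fixed cutoffs of 1.0, and the toy intensities are all assumptions made for illustration, and the ACE transformation itself is not shown.

```python
import math


def log_ratios(numerator, denominator):
    """Per-gene log2 ratio of two equal-length intensity lists."""
    return [math.log2(a / b) for a, b in zip(numerator, denominator)]


def two_stage_filter(normal_1, normal_2, tumor,
                     noise_cutoff=1.0, signal_cutoff=1.0):
    """Hypothetical two-stage filter; cutoffs are illustrative assumptions.

    Stage 1: compare two normal samples. A gene whose |log2 ratio|
             exceeds noise_cutoff is treated as "false expressed".
    Stage 2: among the remaining genes, compare tumor vs. normal and
             keep genes whose |log2 ratio| exceeds signal_cutoff.
    Returns (false_expressed_indices, influential_indices).
    """
    nn = log_ratios(normal_1, normal_2)
    false_expressed = {i for i, r in enumerate(nn) if abs(r) > noise_cutoff}

    nt = log_ratios(tumor, normal_1)
    influential = [i for i, r in enumerate(nt)
                   if i not in false_expressed and abs(r) > signal_cutoff]
    return sorted(false_expressed), influential


# Toy intensities for five genes (made-up values, not data from the report).
normal_1 = [100, 200, 150, 400, 50]
normal_2 = [110, 190, 600, 410, 52]
tumor = [105, 900, 700, 100, 51]

false_expressed, influential = two_stage_filter(normal_1, normal_2, tumor)
print("false expressed genes:", false_expressed)
print("influential genes:", influential)
```

In this toy input, the gene at index 2 differs strongly between the two normal samples, so it is removed in stage 1 even though it also differs between normal and tumor. A fixed cutoff is the simplest possible choice; in practice the threshold would be derived from the observed normal vs. normal variability.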
Subjects
Research Services
Microarray
Biostatistics
Bioinformatics
ACE
Publisher
Taipei: Graduate Institute of Clinical Medicine, College of Medicine, National Taiwan University
Type
journal article
File(s)
Name
902321B002001.pdf
Size
402.6 KB
Format
Adobe PDF
Checksum
(MD5):d63e785fa1fba9bac124779d71b97681
