Abstract
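Abstract (Chinese version): Two-color DNA microarrays are increasingly used in biological experiments to study the expression levels of tens of thousands of genes in parallel. Statistical methods for analyzing such large data sets have developed rapidly in recent years, yet experimental design, which bears on the reliability and cost of the experiment and on data quality, has rarely been examined. Kerr and Churchill (2001) were the first to discuss statistical design issues for two-color DNA microarray experiments, and proposed A-optimal designs for treatments comparative experiments. Yang and Speed (2002) raised further important design issues for two-color DNA microarray experiments. However, their work did not systematically incorporate into the model one of the main nuisance factors of this type of experiment, namely the dye effect. More recently, Chai, Liao and Tsai (2006) considered two main sources of variation together, the difference between slides (slide effect) and the difference between dyes, and proposed a log-ratio model, from which they constructed A-optimal designs for treatments comparative experiments. They also proved analytically that several equireplicate designs found by their proposed algorithm are A-optimal. In another paper, Tsai, Liao and Chai (2006) generalized the log-ratio model by letting the error term carry a possible correlation structure, to cover the case in which the mRNA samples in the experiment come from technical replicates. That study mainly addresses A-optimal designs for test-control comparative experiments; the designs constructed by their method are robust to the correlation that technical replication may induce. They also examine how robust their A-optimal designs are when one or two slides are missing.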
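This project will examine, in greater depth and more systematically, the robustness of experimental designs to missing slides, for cases in which, for some reason, slides are damaged during the experiment or the experiment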
Abstract: cDNA and oligonucleotide-based microarray experiments now make it possible to monitor simultaneously the mRNA expression levels of many thousands of genes across different biological states of cell populations. Statistical methods for analyzing such large-scale, complex data have advanced rapidly in recent years. However, design issues related to the cost of the experiment, the reliability of the data, etc., remain relatively unexplored. Kerr and Churchill (2001) address optimal design issues for two-color microarray experiments and establish a connection between such experiments and classical block designs with blocks of size two. Yang and Speed (2002) discuss further important design issues in two-color microarray experiments. As discussed in these two papers, the most important design issue in two-color microarray experiments is to determine which mRNA samples are to be labeled with which fluorescent dye, and which are to be hybridized together on the same slide, under various scientific and physical constraints.
Most recently, Chai, Liao and Tsai (2006) tackle this problem via a rigorous optimum design construction approach. They propose a linear normalization model that incorporates both the variation between distinct dyes and that between distinct slides. They then generate a series of designs of practical sizes for treatments comparative experiments using a heuristic algorithm. Furthermore, Tsai, Liao and Chai (2006) adapt the normalization model of Chai et al. (2006) to take the correlation due to technical replication into account. In that article, they focus specifically on test-control comparative experiments rather than on treatments comparative experiments. A series of practical designs is likewise reported to assist biologists in designing their experiments.
In this research proposal, we turn our attention to the robustness of designs when some slides are missing. It often happens in a two-color microarray experiment that the data from some slides become unavailable. For this reason, common reference designs, which offer some degree of robustness, are strongly recommended for such experiments (Simon, Radmacher and Dobbin, 2002). However, the efficiency of common reference designs is usually unsatisfactory because they include a common mRNA sample of no interest. Hence, designs that combine high efficiency with robustness against missing slides are in demand. In this study, we evaluate designs on the following two criteria: (1) the connectedness of the residual designs and (2) the relative efficiency of the residual design to the original design. We hope to analytically investigate the properties of two-color microarray designs published in the literature based on t
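The two criteria named above can be illustrated numerically. The following is a minimal sketch, not the method of any of the cited papers: it assumes a simplified log-ratio model with one observation per slide, y = dye + tau[red] - tau[green] + error, with uncorrelated errors. Under that assumption, a residual design is treated as connected when all pairwise treatment contrasts remain estimable, and relative efficiency is taken as the ratio of A-values (the sum of reciprocal nonzero eigenvalues of the treatment information matrix). All function names are hypothetical.

```python
import numpy as np

def information_matrix(slides, t):
    """Treatment information matrix under the ASSUMED log-ratio model
    y = dye + tau[red] - tau[green] + error (one row per slide)."""
    X = np.zeros((len(slides), t))
    for s, (red, green) in enumerate(slides):
        X[s, red] += 1.0
        X[s, green] -= 1.0
    if len(slides) == 0:
        return np.zeros((t, t))
    Xc = X - X.mean(axis=0)  # adjust for the common dye effect
    return Xc.T @ Xc

def nonzero_eigenvalues(C, tol=1e-9):
    """Eigenvalues of the symmetric matrix C that exceed tol."""
    eig = np.linalg.eigvalsh(C)
    return eig[eig > tol]

def missing_slide_report(slides, t):
    """For each single missing slide, return (slide index, connected?,
    relative A-efficiency of the residual design to the original)."""
    eig0 = nonzero_eigenvalues(information_matrix(slides, t))
    if len(eig0) != t - 1:
        raise ValueError("original design is not connected")
    a0 = np.sum(1.0 / eig0)  # A-value of the original design
    report = []
    for k in range(len(slides)):
        residual = slides[:k] + slides[k + 1:]
        eig = nonzero_eigenvalues(information_matrix(residual, t))
        connected = len(eig) == t - 1
        eff = a0 / np.sum(1.0 / eig) if connected else 0.0
        report.append((k, connected, eff))
    return report

# Slides are (red-labeled treatment, green-labeled treatment) pairs.
loop = [(0, 1), (1, 2), (2, 0)]
double_loop = loop + [(g, r) for r, g in loop]  # loop plus its dye swap
for row in missing_slide_report(double_loop, 3):
    print(row)
```

Under these assumptions, the three-treatment loop with its dye swap stays connected after losing any one slide, with a relative A-efficiency of 0.75, whereas the single three-slide loop becomes disconnected as soon as one slide is lost, because two slides cannot support both the dye effect and two treatment contrasts.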
Keyword(s)
Bioinformatics
biochip
gene expression level
missing data
optimal design
block design