周瑞仁臺灣大學:生物產業機電工程學研究所楊宜中Yang, Yi-ChungYi-ChungYang2007-11-262018-07-102007-11-262018-07-102004http://ntur.lib.ntu.edu.tw//handle/246246/52878本研究發展一套相關性分析法(Correlation Approach)進行序列比對(Sequence Alignment)。這種方法可將序列比對拆成相關性比對、搜尋與計分三個階段處理,因此分數矩陣(Scoring Matrix)僅在最後階段加總計分時併入考量即可,它的變動並不影響最佳比對序列(Optimal Sequence)的搜尋;且保證能找到所有的最佳比對序列。相關性分析法係利用反互斥或運算(Exclusive NOR)進行比對序列的運算,比對字元相符(Match)則輸出1,不相符(Mismatch)則輸出0,再從0及1所組成的相關性矩陣(Correlation Matrix)建構累加相關性矩陣(Cumulative Correlation Matrix) ,而從中搜尋出所有最佳比對序列的路徑(Path)並找到相應的最佳比對序列,最後加入分數矩陣計算最佳比對序列的總分。因為前兩步驟都不需要考量分數矩陣,所以除了最後計分階段之外相關性分析法並不受分數矩陣的變動而影響比對結果。但若採用動態規劃(Dynamic Programming)做雙序列比對,則分數矩陣的決定會影響動態規劃的比對結果,一旦分數矩陣變動則必須重新所有比對過程與計算;且不能保證找到全部的最佳序列比對。實驗結果顯示,所提出的方法不但能夠就不同的分數矩陣進行快速運算,提供完整最佳序列比對的資訊,而且較直覺易懂。This study develops a novel method for sequence alignment based on correlation appoach. This approach can be broken down into three stages – correlation, searching and scoring which are decoupling processes. A scoring process is incorporated into the alignment process in the final stage. Therefore, the variation of scoring matrix doesn’t affect the correlation and searching procedures. Moreover, all feasible optimal sequences are ensured to be obtained by using the developed approach. The correlation approach for sequence alignment is mainly based on exclusive NOR operation. If two symbols are matching, output 1as a result, if mismatch, output 0. Subsequently, all feasible paths of optimal sequences are searched and their corresponding sequences are found from the correlation matrix and cumulative correlation matrix. Finally, the score of optimal the sequence is calculated with the specified scoring matrix. Due to the 1st and 2nd stages not considering scoring matrix, the variation of scoring matrix couldn’t affect the aligned results with the correlation approach. Yet, dynamic programming, which is widely used in sequence alignment, often suffers from the recalculation of the whole alignment process while scoring matrices change. Therefore, different optimal sequences are obtained. Compared with dynamic programming, the developed alignment approach with three stages of decoupling processes is proved to provide complete sets of optimal sequences and a more efficient and comprehensive way for sequence alignment.誌謝………………………………………………………………………I 中文摘要……………….………………………………………………II 英文摘要………………………………………………………………III 圖目錄………………….………………………………………………VI 表目錄………………….………………………………………………IX 第一章 前言……………………………………………………………1 第二章 文獻探討………………………………………………………4 2.1 動態規劃(Dynamic Programming).................6 2.1.1 Needleman-Wunsch Method(Global)……………6 2.1.2 Smith-Waterman Method(Local)………………11 2.2 散列編碼法……………………………………………….14 2.3 序列資料庫搜尋………………………………………….16 2.3.1 FASTA(EBI)………………………………………17 2.3.2 BLAST(NCBI)…………………………………….20 第三章 材料與方法………………………………………………….25 3.1 反互斥或運算…………………………………………….25 3.2 相關性矩陣的路徑搜尋………………………………….26 3.2.1 路徑搜尋演算法………………………………….28 3.2.2 區域性序列比對連接成全域性序列比對……….29 3.2.3 最佳比對序列加權計分………………………….34 3.3 不相符的無間隙比對與間隙比對……………………….35 3.3.1 不相符的無間隙比對………………………………36 3.3.2 不相符的間隙比對…………………………………38 第四章 實驗結果與討論…………………………………………….41 4.1 以動態規劃法進行序列比對…………………………….41 4.1.1 產生唯一的回溯路徑………………………………41 4.1.2 產生不同的回溯路徑………………………………44 4.2 以相關性分析法進行序列比對………………………….47 4.2.1 以相關分析法解決DP僅能產生唯一路徑的問題…47 4.2.2 以相關分析法解決DP可能產生不同路徑的問題…51 4.3 以軟體驗證實驗結果…………………………………….55 4.3.1 利用EMBOSS進行序列比對…………………………55 4.3.2 利用Correlation進行序列比對………………….56 第五章 結論………………………………………………………….60 參考文獻……………………………………………………………….62852647 bytesapplication/pdfen-US分數矩陣動態規劃相關性最佳比對序列Optimal alignment sequenceScoring matrixDynamic programmingCorrelation利用反互斥或相關性分析進行序列比對Sequence Alignment with XNOR Correlation Analysisthesishttp://ntur.lib.ntu.edu.tw/bitstream/246246/52878/1/ntu-93-R91631023-1.pdf