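Contributor: 高成炎
Affiliation: National Taiwan University, Graduate Institute of Computer Science and Information Engineering (臺灣大學:資訊工程學研究所)
Author: 趙東方 (Chao, Tung-Fang)
Record dates: 2007-11-26, 2018-07-05
Year: 2004
URL: http://ntur.lib.ntu.edu.tw//handle/246246/54051

Abstract (translated from Chinese)

Motivation: Disulfide bonds play an important role in protein structure. Because a disulfide bond is covalent and its bond energy is higher than that of an ordinary hydrogen bond, it strongly influences protein structure. Correctly predicting disulfide connectivity therefore helps in predicting the three-dimensional structure of proteins.

Results: In this thesis we introduce a simple method for predicting disulfide connectivity. We assume a simple rule: "proteins with the same disulfide connectivity pattern have corresponding distances between pairs of cysteines in the protein sequence." Based on this assumption, we built a profile from a protein database. The profile records each disulfide connectivity pattern together with the sequence distances between pairs of cysteines, and we use it to predict disulfide connectivity. We ran two experiments. In the first, we predicted disulfide bonds on test data used in earlier work; with sequence identity between the profile and the test proteins at or below 30%, Qp (the prediction accuracy) is 0.49, better than other algorithms (0.44). In the second, we used entries from an older database release to predict disulfide connectivity for proteins newly added to the current database; with sequence identity between the profile and the test proteins below 30%, Qp is 0.53. We therefore consider our basic assumption credible to some degree and applicable to the prediction of disulfide connectivity.

Abstract

Motivation. Disulfide bonds play an important role in protein folding. Exact prediction of disulfide connectivity reduces the search space in protein structure prediction and may therefore help 3D structure prediction.

Result. In this paper, we propose a simple rule to define a disulfide connectivity pattern: "The same disulfide connectivity patterns have the same distance between two cysteines in protein sequences." We use this rule to create a disulfide profile, and then use a minimum distance scoring function to predict disulfide connectivity. We report experimental results on two test sets. The first compares our method with other algorithms; the second tests performance on unknown proteins. In the first experiment, Qp is 0.49 for non-redundant proteins in the test set with less than 30% sequence identity, better than the other algorithms (Qp = 0.44). In the second experiment, Qp is 0.53 for non-redundant proteins in the test set with less than 30% sequence identity. We therefore believe that our disulfide profile can achieve high accuracy when predicting unknown proteins.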
The method proposed here is relatively simple and can generate more accurate results than conventional methods. It may also be combined with other algorithms for further improvements in disulfide connectivity prediction.

Table of Contents
Abstract (in Chinese)
Abstract
Acknowledgements (in Chinese)
Table of Contents
List of Tables
List of Figures
Chapter 1 - Introduction
  1.1 Motivations and Purposes
  1.2 Problem Defined
  1.3 Related Works
Chapter 2 - System and Method
  2.1 System Overview
  2.2 Basic Assumption
  2.3 Defining the content for Profile
  2.4 Scoring Function
  2.5 Search Algorithm
  2.6 Sequence Identity
  2.7 Cross Validation
Chapter 3 - Implementation
  3.1 The Protein Test Set
  3.2 The protein profile set
  3.3 Performance Measures
Chapter 4 - Results and Discussion
  4.1 Result for Experiment 1
  4.2 Result for Experiment 2
  4.3 Distance analysis for experiment 1
  4.4 Distance analysis for experiment 2
  4.5 Distance Relationship
Chapter 5 - Conclusions and Future Work
  5.1 Conclusions
  5.2 Future Work
References

File size: 788186 bytes
Format: application/pdf
Language: en-US
Keywords: 雙硫鍵 (disulfide bond); cysteine separation profile; disulfide connectivity
Title: 雙硫鍵連結預測-使用序列距離之前置檔 (Disulfide Connectivity Prediction using Sequence Distance Profile)
Type: thesis
File: http://ntur.lib.ntu.edu.tw/bitstream/246246/54051/1/ntu-93-P90922001-1.pdf
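
The profile lookup described in the abstract might be sketched as follows. This is only an illustration under stated assumptions, not the thesis implementation: the abstract does not define the scoring function, so the sum of absolute differences used here is an assumed distance, and taking separations between consecutive cysteines is an assumed reading of "distance between two cysteines". The function names, the profile layout, and the toy data are all hypothetical.

```python
from typing import List, Optional, Tuple

# A connectivity pattern is a tuple of bonded cysteine index pairs,
# e.g. ((0, 1), (2, 3)) means cysteine 0 bonds to 1 and 2 bonds to 3.
Pattern = Tuple[Tuple[int, int], ...]
ProfileEntry = Tuple[List[int], Pattern]


def cysteine_separations(sequence: str) -> List[int]:
    """Sequence distances between consecutive cysteines (assumed definition)."""
    positions = [i for i, residue in enumerate(sequence) if residue == "C"]
    return [b - a for a, b in zip(positions, positions[1:])]


def predict_connectivity(
    query: List[int], profile: List[ProfileEntry]
) -> Tuple[Optional[Pattern], Optional[int]]:
    """Return the pattern of the profile entry closest to the query.

    Distance is the sum of absolute differences between separation
    vectors (an assumption; the abstract only says "minimum distance").
    Entries with a different number of separations are skipped.
    """
    best_pattern: Optional[Pattern] = None
    best_score: Optional[int] = None
    for separations, pattern in profile:
        if len(separations) != len(query):
            continue
        score = sum(abs(q - s) for q, s in zip(query, separations))
        if best_score is None or score < best_score:
            best_pattern, best_score = pattern, score
    return best_pattern, best_score


# Toy profile with made-up entries, for illustration only.
toy_profile: List[ProfileEntry] = [
    ([3, 5, 2], ((0, 1), (2, 3))),
    ([10, 2, 7], ((0, 3), (1, 2))),
]
query_separations = cysteine_separations("ACDDCEEECFC")
predicted_pattern, score = predict_connectivity(query_separations, toy_profile)
```

In this sketch the query protein inherits the connectivity pattern of its nearest profile entry. A real profile would also need the sequence-identity filtering between profile and test proteins that the abstract mentions, which is omitted here.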