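高成炎
National Taiwan University: Graduate Institute of Computer Science and Information Engineering
張嘉文 (Chang, Chia-Wen)
2007-11-26
2018-07-05
2004
http://ntur.lib.ntu.edu.tw//handle/246246/53828

Deinococcus radiodurans R1 is a rather unusual Gram-positive mesophilic coccus. Its most distinctive feature is an extremely high resistance to a wide range of environmental stresses, in particular to many DNA-damaging agents such as ionizing radiation, ultraviolet light, and hydrogen peroxide. Since the complete genome sequence of D. radiodurans was published in 1999, the genome has been known to carry 3,195 genes distributed over two chromosomes and two plasmids, one large and one small. Given this information, using the complete genome sequence of D. radiodurans R1 to study how the bacterium withstands stress, especially its mechanisms of radiation resistance and DNA repair, is a feasible research approach. A deeper understanding of its radiation resistance mechanisms could eventually allow the relevant enzymes to be used in medical research or in the treatment of radioactive waste.

To date, knowledge of the radiation resistance mechanisms of D. radiodurans R1 remains limited, and it is unclear why the bacterium is so highly resistant to radiation in its natural state. Even with whole-genome analysis, the roles that genes and proteins play in radiation resistance are not well understood. It can be inferred, however, that for D. radiodurans R1 to survive high-energy radiation, the proteins required for its physiological and biochemical reactions must first withstand that radiation themselves; only then can they provide enough protection and repair capacity to help the cell survive when it is under threat.

This study attempts to explore the possible radiation resistance mechanisms of D. radiodurans R1 from a bioinformatics perspective, combining databases with genomics. The research material consists of the protein databases of two completely sequenced genomes, D. radiodurans R1 and Escherichia coli K-12. An RDBMS (Relational Database Management System) is used to analyze amino acid differences between the proteins of the two organisms in four steps: (1) loading the protein databases ("PROTEIN LOADER" loaded the FASTA format proteins into the database); (2) semi-automatic distribution and comparison of protein data (SADPC: Semi-Automatic Distributed Protein Comparison); (3) screening of significant protein comparison results (SPVAW: Significant Protein Viewer and Writer); and (4) analysis and output of the best comparison results (MSPS: Most Significant Protein Search, Analysis and Output tool). Among the thousands of genes of D. radiodurans R1 and E. coli K-12, the comparison looks for proteins that are similar in amino acid composition, structure, or properties but differ in radiation resistance, to help study the bacterium's possible radiation resistance mechanisms.

According to the preliminary analysis, of the 3,195 genes in D. radiodurans R1, the programs written for this study have selected 997 proteins that meet the criteria; after excluding duplicated or hard-to-analyze proteins, this number is expected to fall to about 300. In future work, the genes for these proteins will be cloned, recombined, and expressed to obtain large quantities of purified target protein, and the radiation resistance of the proteins will be tested experimentally. The ultimate goal is to explain microbial radiation resistance from the perspective of proteins.

The bacterium Deinococcus radiodurans R1 is extremely resistant to ionizing radiation, UV light, hydrogen peroxide, and numerous other agents that damage DNA, and is also highly resistant to desiccation. The D. radiodurans R1 genome carries 3,195 predicted genes and consists of two chromosomes, one megaplasmid, and one plasmid. This combination of factors has positioned D.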
radiodurans R1 as a promising candidate for the study of mechanisms of DNA damage and repair, as well as for practical applications such as the cleanup and stabilization of radioactive waste sites. The radiation resistance of D. radiodurans R1 appears to be very complex: it is determined collectively by features revealed by genome analysis and by many more subtle structural peculiarities of proteins and DNA that are not readily inferred from comparative sequence analysis. The fundamental questions underlying the extreme resistance phenotype of D. radiodurans R1 remain unanswered. In this study, the protein databases of two organisms, the radiation-resistant bacterium Deinococcus radiodurans R1 and the mesophilic, radiation-sensitive bacterium Escherichia coli K-12, were subjected to whole-protein-sequence comparative analysis. Several computational tools built on an RDBMS (Relational Database Management System) were applied. The analysis consisted of several programs organized in four steps: Step 1: "PROTEIN LOADER" loaded the FASTA format proteins into the database; Step 2: SADPC (Semi-Automatic Distributed Protein Comparison); Step 3: SPVAW (Significant Protein Viewer and Writer); and Step 4: MSPS (Most Significant Protein Search, Analysis and Output tool). From the protein database comparison results, the aim is to find candidate proteins in D. radiodurans R1 and E. coli K-12 that have similar composition, three-dimensional structure, or characteristics but differ in radiation resistance. The results indicate that the programs developed in this study help biologists analyze their data. Further work, such as manual selection of candidate proteins, is in progress. In preliminary results, 997 candidate proteins were selected from the thousands of genes of D. radiodurans R1 and E. coli K-12. After duplicated and hypothetical proteins are removed, the target number of candidate proteins is under 300.
In vitro radiation resistance tests of the proteins will be performed after gene cloning, expression, and purification. The final goal is to explain the radiation resistance mechanisms of microbes from the perspective of proteins.

Chinese Abstract
English Abstract
Chapter 1 Introduction
Chapter 2 System and method
    I. Genome sequences database
    II. Computation process
        Step 1: "PROTEIN LOADER" loaded the FASTA format proteins into the database
        Step 2: SADPC (Semi-Automatic Distributed Protein Comparison)
        Step 3: SPVAW (Significant Protein Viewer and Writer)
        Step 4: MSPS (Most Significant Protein Search, Analysis and Output tool)
Chapter 3 Results and Discussion
    I. Protein database comparison and sample size reducing process
    II. The preliminary comparison results
Chapter 4 Conclusion
Chapter 5 Future works
    I. Biological and biochemical applications
    II. Computer science applications – SADPC
    III. Computer science applications – modify the kernel of SADPC
    IV. Computer science applications – User interface
Chapter 6 Reference

1815578 bytes
application/pdf
en-US
protein, Deinococcus radiodurans R1, Amino Acid, database
以蛋白質資料庫之比對預測抗輻射奇異球菌之抗輻射機制
Exploring of Radiation Resistance from Bacterium Deinococcus radiodurans R1 by Amino Acid Compositions of Proteins: View from Comparative Whole Protein Sequences
thesis
http://ntur.lib.ntu.edu.tw/bitstream/246246/53828/1/ntu-93-P90922009-1.pdf
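
The abstract describes loading FASTA-format proteins into a relational database and then comparing proteins by amino acid composition. The record does not show how PROTEIN LOADER or SADPC are implemented, so the following is only a minimal sketch of that general idea. The table layout, the function names, the use of SQLite, and the Euclidean distance between composition vectors are all assumptions made for illustration, and the sequences are invented toy data.

```python
import sqlite3
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def load_proteins(conn, organism, fasta_text):
    """Insert every FASTA record into a hypothetical 'protein' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS protein "
        "(organism TEXT, header TEXT, sequence TEXT)"
    )
    rows = [(organism, h, s) for h, s in parse_fasta(fasta_text)]
    conn.executemany("INSERT INTO protein VALUES (?, ?, ?)", rows)
    return len(rows)


def composition(sequence):
    """Return the fraction of each standard amino acid in a sequence."""
    counts = Counter(sequence)
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        return {a: 0.0 for a in AMINO_ACIDS}
    return {a: counts[a] / total for a in AMINO_ACIDS}


def composition_distance(seq_a, seq_b):
    """Euclidean distance between composition vectors (assumed metric)."""
    ca, cb = composition(seq_a), composition(seq_b)
    return sum((ca[a] - cb[a]) ** 2 for a in AMINO_ACIDS) ** 0.5


# Toy usage with invented sequences, not real proteome data.
conn = sqlite3.connect(":memory:")
load_proteins(conn, "D. radiodurans R1", ">toy_dr1\nMKVLAAGG\n")
load_proteins(conn, "E. coli K-12", ">toy_ec1\nMKVLAGGG\n")
dr_seq, = conn.execute(
    "SELECT sequence FROM protein WHERE organism = 'D. radiodurans R1'"
).fetchone()
ec_seq, = conn.execute(
    "SELECT sequence FROM protein WHERE organism = 'E. coli K-12'"
).fetchone()
print(composition_distance(dr_seq, ec_seq))
```

A small distance means two proteins have similar amino acid composition. Selecting pairs that are similar by such a measure yet differ in radiation resistance would still require experimental data, as the abstract notes.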