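Institution: National Taiwan University (臺灣大學), Graduate Institute of Bio-Industrial Mechatronics Engineering (生物產業機電工程學研究所)
Contributor: 陳倩瑜
Author: 趙健合 (Chao, Chien-Ho)
Record dates: 2013-03-21; 2018-07-10
Year: 2011
URI: http://ntur.lib.ntu.edu.tw//handle/246246/247664

Abstract (translated from the Chinese):
Proteins are essential to sustaining life. In living organisms, the binding of proteins to DNA drives many biochemical reactions and activities; for example, the binding of a transcription factor to specific DNA can switch on transcription of a specific gene. The interaction between proteins and DNA has therefore long been a subject that biologists compete to study. In recent years, with advances in computer technology and computing power, biologists and statisticians have used the computational and aggregating capacity of computer programs to support traditional biological experiments. Within this work, predicting the affinity of interactions between proteins and other biological units such as proteins, small molecules, and even DNA has been a topic of sustained interest. Many recent studies have addressed it and developed different kinds of scoring functions for affinity prediction; among them, scoring functions based on machine learning algorithms have achieved good results in recent years on the problem of predicting protein-small molecule binding affinity.
This thesis attempts to design a machine learning-based scoring function that predicts protein-DNA binding affinity. The study selects high-quality protein-DNA complex structures and experimentally determined affinity data as its source material and builds a scoring function that combines a knowledge base with machine learning algorithms. The experimental results show that a random forest-based classification method can also produce good predictions of protein-DNA binding affinity. The thesis also introduces different kinds of feature extraction methods and discusses their effect on the prediction results, in the hope of contributing to research on scoring functions for binding affinity between biological macromolecules.

Abstract (English):
Proteins and DNA play important roles in maintaining life in living cells. The binding of proteins to specific DNA sequences is the starting point of many biological activities. For instance, the binding of transcription factors, proteins that trigger transcription of a particular gene, to regulatory sites on DNA initiates the transcription process. Research on this issue could facilitate studies of gene regulation and regulatory networks. For these reasons, the study of interactions between proteins and DNA has long attracted much attention. With recent advances in computer technology and algorithm development, computational methods for predicting the binding affinity of protein-protein, protein-ligand, and even protein-DNA interactions have been widely investigated. Some scoring functions for predicting protein-ligand binding affinity have been shown to perform well on this challenge. In this thesis, a machine learning-based scoring function was developed to predict the binding affinity of protein-DNA interactions. For this purpose, a high-quality dataset of protein-DNA complexes with associated binding affinity data was collected from PDBbind.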
The performance of the proposed method was compared with that of existing scoring functions. The results indicate that the proposed machine learning-based scoring function performs well in predicting the binding affinities of protein-DNA complexes and can benefit future studies on this problem.

File size: 4812277 bytes
Format: application/pdf
Language: en-US
Keywords: protein-DNA interaction; scoring function; random forest; binding affinity prediction
Title: Predicting Binding Affinity of Protein-DNA Interactions Using Machine Learning-based Scoring Functions
Original title: 建立以機器學習演算法為基礎之評分函數預測蛋白質與DNA結合之親和力
Type: thesis
Full text: http://ntur.lib.ntu.edu.tw/bitstream/246246/247664/1/ntu-100-R98631042-1.pdf