Applying Machine Learning on Prediction of RNA-Binding Residues in Proteins
Other Title
應用機器學習方法預測核糖核酸與蛋白質結合位置
Date Issued
2010
Date
2010
Author(s)
邱莉媛
Abstract
RNA-binding proteins (RBPs) are vital for recognizing sequences of ribonucleic acid (RNA), the genetic material derived from DNA. To satisfy diverse functional requirements, RNA-binding proteins are composed of multiple repeated blocks of RNA-binding domains, presented in various structural arrangements to provide versatile functions. The ability to computationally predict RNA-binding residues in an RNA-binding protein can give biologists clues for site-directed mutagenesis in wet-lab experiments. ProteRNA, the prediction framework proposed in this thesis, combines a Support Vector Machine (SVM) with WildSpan to identify RNA-interacting residues in an RNA-binding protein. The SVM predicts from position-specific scoring matrix (PSSM) and protein secondary structure information, while WildSpan relies on conserved domain information. The SVM predictor alone achieves an F-score of 0.5127, while the WildSpan hybrid predictor achieves an F-score of 0.5362. On the independent testing dataset, ProteRNA delivers an overall accuracy of 89.55%, a Matthews correlation coefficient (MCC) of 0.2686, and an F-score of 0.3185. ProteRNA surpasses the other web servers in accuracy, MCC, and F-score.
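The three measures reported in the abstract (accuracy, MCC, and F-score) can all be computed from the per-residue confusion counts of a binary predictor. The following is a minimal sketch, assuming the F-score is the usual balanced F1 (the abstract does not say which variant is used). The function name `binary_metrics` and the counts in the usage example are hypothetical and are not taken from the thesis; they only illustrate why accuracy can be high while MCC and F-score stay modest when binding residues are a small minority.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, F1, and MCC from binary confusion counts.

    tp/fn count true binding residues predicted as binding/non-binding;
    fp/tn count non-binding residues predicted as binding/non-binding.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0

    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "f_score": f_score, "mcc": mcc}


# Hypothetical, imbalanced example: 80 binding residues out of 1000.
metrics = binary_metrics(tp=30, fp=50, tn=870, fn=50)
print(metrics)
```

With these made-up counts the accuracy is 0.9 even though fewer than half of the binding residues are recovered, which is why MCC and F-score are reported alongside accuracy for imbalanced residue-level prediction.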
Subjects
Machine Learning
Support Vector Machine
RNA Binding Residues Prediction
Type
thesis
File(s)
Name
ntu-99-R97525034-1.pdf
Size
23.53 KB
Format
Adobe PDF
Checksum
(MD5):b0cb1d39148021b1996dd44693403da6
