Comprehensive Study of Keywords for Sequence-Based Automatic Annotation of Protein Functions
Journal
Proceedings - IEEE 20th International Conference on Bioinformatics and Bioengineering, BIBE 2020
Pages
23-28
Date Issued
2020
Author(s)
Abstract
Homology-based transfer is frequently used to predict the functions of unannotated protein sequences through similarity analysis between the target and previously annotated sequences. The most direct and accessible homology-based transfer approach is sequence alignment. To assess the reliability of alignment-based prediction, we applied a 10-fold cross-validation test on the Swiss-Prot database. We compared the Matthews correlation coefficient, sensitivity, and precision, and examined different parameter settings for the alignment-based methods BLASTp and PSI-BLAST. The results show that keywords in the categories of domain, ligand, molecular function, biological process, cellular component, and PTM can be confidently used for protein function prediction, whereas keywords in the other categories are less reliable. © 2020 IEEE.
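The three measures named in the abstract can be illustrated with a minimal sketch. It assumes that each keyword is scored as a binary prediction (keyword transferred or not) and that true/false positive and negative counts are already available. The function name `keyword_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def keyword_metrics(tp, fp, tn, fn):
    """Return (sensitivity, precision, mcc) from binary confusion counts.

    tp, fp, tn, fn are the true-positive, false-positive, true-negative and
    false-negative counts for one keyword (hypothetical inputs).
    Undefined ratios (zero denominator) are reported as 0.0.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, precision, mcc


# Illustrative counts only, not results from the study.
sens, prec, mcc = keyword_metrics(tp=8, fp=2, tn=88, fn=2)
print(f"sensitivity={sens:.3f} precision={prec:.3f} MCC={mcc:.3f}")
```

In a 10-fold cross-validation setting, one plausible way to use such a function is to compute the counts on each held-out fold and average the resulting scores; whether the paper averages per fold or pools counts is not stated in the abstract.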
Subjects
Bioinformatics; Forecasting; 10-fold cross-validation; Automatic annotation; Cellular components; Correlation coefficient; Molecular function; Protein function prediction; Sequence alignments; Similarity analysis; Proteins
Type
conference paper