Title: Comprehensive Study of Keywords for Sequence-Based Automatic Annotation of Protein Functions
Authors: Li Y.-C; Lin M.-J; Huang X.-X; CHIEN-YU CHEN; YI-CHANG LU
Type: conference paper
Year: 2020
Record dates: 2021-07-26; 2021-07-26
DOI: 10.1109/BIBE50027.2020.00012
Scopus ID: 2-s2.0-85099587402
Scopus URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85099587402&doi=10.1109%2fBIBE50027.2020.00012&partnerID=40&md5=5c0dd2b2cd5d8f8824933297d2c64f50
Handle: https://scholars.lib.ntu.edu.tw/handle/123456789/573117

Abstract: Homology-based transfer is frequently used to predict the protein functions of unannotated sequences through similarity analysis between the target and previously annotated sequences. The most direct and accessible homology-based transfer approach is sequence alignment. To assess the reliability of alignment-based prediction, we applied a 10-fold cross-validation test on the Swiss-Prot database. We compared the Matthews correlation coefficient, sensitivity, and precision, and examined different parameter settings for the alignment-based methods BLASTp and PSI-BLAST. The results show that keywords in the categories of domain, ligand, molecular function, biological process, cellular component, and PTM can be used confidently for protein function prediction, whereas keywords in the other categories are less reliable. © 2020 IEEE.

Keywords: Bioinformatics; Forecasting; 10-fold cross-validation; Automatic annotation; Cellular components; Correlation coefficient; Molecular function; Protein function prediction; Sequence alignments; Similarity analysis; Proteins
SDGs: SDG3
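
For readers unfamiliar with the three measures named in the abstract (sensitivity, precision, and the Matthews correlation coefficient), the following minimal Python sketch shows how they are commonly computed from confusion-matrix counts for a single keyword. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper; the sketch illustrates the standard definitions only, not the authors' evaluation pipeline.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, precision, mcc) from confusion-matrix counts.

    Hypothetical helper for illustration; undefined ratios fall back to 0.0.
    """
    # Sensitivity (recall): fraction of true annotations that were predicted.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Precision: fraction of predicted annotations that were correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Matthews correlation coefficient: ranges from -1 to 1, 0 is chance level.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, precision, mcc


# Made-up counts for one keyword in one cross-validation fold.
sens, prec, mcc = binary_metrics(tp=80, fp=20, tn=880, fn=20)
print(f"sensitivity={sens:.3f} precision={prec:.3f} MCC={mcc:.3f}")
```

In a 10-fold cross-validation setting, such counts would typically be accumulated or averaged over the ten held-out folds; how the paper aggregates them is not stated in this record.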