https://scholars.lib.ntu.edu.tw/handle/123456789/641772
Title: | 16S-ITGDB: An Integrated Database for Improving Species Classification of Prokaryotic 16S Ribosomal RNA Sequences | Authors: | Hsieh, Yu-Peng; Hung, Yuan-Mao; Tsai, Mong-Hsun; LIANG-CHUAN LAI; Chuang, Eric Y |
Keywords: | 16S full length; 16S rRNA (16S rDNA); ITGDB; metagenomics 16S; sequence classification; taxonomy assignment; third-generation sequencing | Publication date: | 2022 | Volume: | 2 | Source: | Frontiers in Bioinformatics | Abstract: | Analyzing 16S ribosomal RNA (rRNA) sequences allows researchers to elucidate the prokaryotic composition of an environment. In recent years, third-generation sequencing technology has provided opportunities for researchers to perform full-length sequence analysis of bacterial 16S rRNA. RDP, SILVA, and Greengenes are the most widely used 16S rRNA databases. Many 16S rRNA classifiers have used these databases as a reference for taxonomic assignment tasks. However, some of the prokaryotic taxonomies only exist in one of the three databases. Furthermore, Greengenes and SILVA include a considerable number of taxonomies that do not have the resolution to the species level, which has limited the classifiers' performance. In order to improve the accuracy of taxonomic assignment at the species level for full-length 16S rRNA sequences, we manually curated the three databases and removed the sequences that did not have a species name. We then established a taxonomy-based integrated database by considering both taxonomies and sequences from all three 16S rRNA databases and validated it by a mock community. Results showed that our taxonomy-based integrated database had improved taxonomic resolution to the species level. The integrated database and the related datasets are available at https://github.com/yphsieh/ItgDB. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/641772 | ISSN: | 2673-7647 | DOI: | 10.3389/fbinf.2022.905489 |
Appears in collections: | Graduate Institute of Physiology |