https://scholars.lib.ntu.edu.tw/handle/123456789/632417
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yang K.-C | en-US |
dc.contributor.author | Ho T.-H | en-US |
dc.contributor.author | Lin J.-S | en-US |
dc.contributor.author | LIN-SHAN LEE | en-US |
dc.creator | Yang K.-C;Ho T.-H;Lin J.-S;Lee L.-S. | - |
dc.date.accessioned | 2023-06-09T07:52:59Z | - |
dc.date.available | 2023-06-09T07:52:59Z | - |
dc.date.issued | 1997 | - |
dc.identifier.uri | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85121257022&partnerID=40&md5=8a1b12b844f4633cdb46d8d405ba27a6 | - |
dc.identifier.uri | https://scholars.lib.ntu.edu.tw/handle/123456789/632417 | - |
dc.description.abstract | In this paper we present a novel approach to truncating a combined word-based and class-based n-gram language model using a Kullback-Leibler distance criterion. First, we investigate a reliable backoff scheme for unseen n-grams using a class-based language model, which outperforms conventional approaches using the (n-1)-gram in perplexity on both training and testing data. For language model truncation, our approach uses dynamic thresholds for different words or word contexts determined by the Kullback-Leibler distance criterion, as opposed to the conventional scheme, which truncates the language model with a constant threshold. In our experiments, 80% of the parameters are removed by using the combined word-based and class-based n-gram language model and the Kullback-Leibler distance truncation criterion, while perplexity increases by only 1.6%, compared with the word bigram language model without any truncation. © 1997 Proceedings of the 10th Research on Computational Linguistics International Conference, ROCLING 1997. All rights reserved. | - |
dc.relation.ispartof | Proceedings of the 10th Research on Computational Linguistics International Conference, ROCLING 1997 | - |
dc.subject.other | Backoffs; Class-based; Class-based language model; Conventional approach; Distance criterion; Kullback-Leibler distance; N-gram language models; N-grams; Training and testing; Computational linguistics | - |
dc.title | Truncation on combined word-based and class-based language model using Kullback-Leibler distance criterion | en_US |
dc.type | conference paper | en |
dc.identifier.scopus | 2-s2.0-85121257022 | - |
dc.relation.pages | 335-344 | - |
item.fulltext | no fulltext | - |
item.openairecristype | http://purl.org/coar/resource_type/c_5794 | - |
item.cerifentitytype | Publications | - |
item.openairetype | conference paper | - |
item.grantfulltext | none | - |
crisitem.author.dept | Networking and Multimedia | - |
crisitem.author.dept | Computer Science and Information Engineering | - |
crisitem.author.dept | Electrical Engineering | - |
crisitem.author.dept | Communication Engineering | - |
crisitem.author.dept | MediaTek-NTU Research Center | - |
crisitem.author.dept | Center for Artificial Intelligence and Advanced Robotics | - |
crisitem.author.orcid | 0000-0002-6039-8298 | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
crisitem.author.parentorg | Others: University-Level Research Centers | - |
crisitem.author.parentorg | Others: University-Level Research Centers | - |
Appears in Collections: | Department of Electrical Engineering |
Items in the IR system are protected by copyright, with all rights reserved, unless otherwise indicated in their copyright terms.
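The pruning idea in the abstract, scoring each n-gram by how much the model diverges (in Kullback-Leibler distance) when that n-gram is replaced by its backoff estimate, and keeping only entries whose score exceeds a threshold, can be sketched for the bigram case as below. This is a minimal illustrative sketch, not the paper's implementation: the function names, the plain maximum-likelihood/unigram-backoff estimates, and the simplified per-bigram KL contribution are all assumptions, and the paper's method additionally combines word-based and class-based models with per-context dynamic thresholds.

```python
import math
from collections import Counter

def bigram_kl_scores(tokens):
    """Score each bigram by an approximate contribution to the KL distance
    between the full bigram model and a model that backs the bigram off to
    its unigram probability. Simplified sketch: maximum-likelihood estimates,
    no smoothing, word-based model only."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    total = len(tokens)
    scores = {}
    for (h, w), count in bigrams.items():
        p_w_given_h = count / unigrams[h]   # bigram probability p(w | h)
        p_w = unigrams[w] / total           # unigram backoff estimate p(w)
        p_h = unigrams[h] / total           # weight of the context h
        # Weighted log-ratio: this bigram's contribution to D(p || p_backoff).
        scores[(h, w)] = p_h * p_w_given_h * math.log(p_w_given_h / p_w)
    return scores

def prune(scores, threshold):
    """Keep only bigrams whose removal would cost more than `threshold`
    in KL distance; pruned bigrams would fall back to unigram probabilities."""
    return {bigram for bigram, s in scores.items() if s > threshold}
```

A constant `threshold`, as in the conventional scheme the abstract contrasts against, treats every context alike; the paper's dynamic-threshold variant would instead choose the cutoff per word or per context based on the same KL criterion.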