https://scholars.lib.ntu.edu.tw/handle/123456789/632417
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yang K.-C | en-US |
dc.contributor.author | Ho T.-H | en-US |
dc.contributor.author | Lin J.-S | en-US |
dc.contributor.author | LIN-SHAN LEE | en-US |
dc.creator | Yang K.-C;Ho T.-H;Lin J.-S;Lee L.-S. | - |
dc.date.accessioned | 2023-06-09T07:52:59Z | - |
dc.date.available | 2023-06-09T07:52:59Z | - |
dc.date.issued | 1997 | - |
dc.identifier.uri | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85121257022&partnerID=40&md5=8a1b12b844f4633cdb46d8d405ba27a6 | - |
dc.identifier.uri | https://scholars.lib.ntu.edu.tw/handle/123456789/632417 | - |
dc.description.abstract | In this paper we present a novel approach to truncating a combined word-based and class-based n-gram language model using a Kullback-Leibler distance criterion. First, we investigate a reliable backoff scheme for unseen n-grams using a class-based language model, which outperforms conventional approaches using the (n-1)-gram in perplexity on both training and testing data. For language model truncation, our approach uses dynamic thresholds for different words or word contexts determined by the Kullback-Leibler distance criterion, as opposed to the conventional scheme, which truncates the language model with a constant threshold. In our experiments, 80% of the parameters are removed by using the combined word-based and class-based n-gram language model and the Kullback-Leibler distance truncation criterion, while perplexity increases by only 1.6%, compared with the word bigram language model without any truncation. © 1997 Proceedings of the 10th Research on Computational Linguistics International Conference, ROCLING 1997. All rights reserved. | - |
dc.relation.ispartof | Proceedings of the 10th Research on Computational Linguistics International Conference, ROCLING 1997 | - |
dc.subject.other | Backoffs; Class-based; Class-based language model; Conventional approach; Distance criterion; Kullback-Leibler distance; N-gram language models; N-grams; Training and testing; Computational linguistics | - |
dc.title | Truncation on combined word-based and class-based language model using Kullback-Leibler distance criterion | en_US |
dc.type | conference paper | en |
dc.identifier.scopus | 2-s2.0-85121257022 | - |
dc.relation.pages | 335-344 | - |
item.fulltext | no fulltext | - |
item.openairecristype | http://purl.org/coar/resource_type/c_5794 | - |
item.cerifentitytype | Publications | - |
item.openairetype | conference paper | - |
item.grantfulltext | none | - |
crisitem.author.dept | Networking and Multimedia | - |
crisitem.author.dept | Computer Science and Information Engineering | - |
crisitem.author.dept | Electrical Engineering | - |
crisitem.author.dept | Communication Engineering | - |
crisitem.author.dept | MediaTek-NTU Research Center | - |
crisitem.author.dept | Center for Artificial Intelligence and Advanced Robotics | - |
crisitem.author.orcid | 0000-0002-6039-8298 | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
crisitem.author.parentorg | Others: University-Level Research Centers | - |
crisitem.author.parentorg | Others: University-Level Research Centers | - |
Appears in Collections: | Department of Electrical Engineering |
Items in the IR system are protected by copyright, with all rights reserved, unless otherwise indicated in their copyright terms.
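The pruning idea in the abstract, scoring each n-gram by how much the model diverges (in Kullback-Leibler distance) when that n-gram is replaced by its backoff estimate, and keeping only entries whose score exceeds a threshold, can be sketched for the bigram case as below. This is a minimal illustrative sketch, not the paper's implementation: the function names, the plain maximum-likelihood/unigram-backoff estimates, and the simplified per-bigram KL contribution are all assumptions, and the paper's method additionally combines word-based and class-based models with per-context dynamic thresholds.

```python
import math
from collections import Counter

def bigram_kl_scores(tokens):
    """Score each bigram by an approximate contribution to the KL distance
    between the full bigram model and a model that backs the bigram off to
    its unigram probability. Simplified sketch: maximum-likelihood estimates,
    no smoothing, word-based model only."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    total = len(tokens)
    scores = {}
    for (h, w), count in bigrams.items():
        p_w_given_h = count / unigrams[h]   # bigram probability p(w | h)
        p_w = unigrams[w] / total           # unigram backoff estimate p(w)
        p_h = unigrams[h] / total           # weight of the context h
        # Weighted log-ratio: this bigram's contribution to D(p || p_backoff).
        scores[(h, w)] = p_h * p_w_given_h * math.log(p_w_given_h / p_w)
    return scores

def prune(scores, threshold):
    """Keep only bigrams whose removal would cost more than `threshold`
    in KL distance; pruned bigrams would fall back to unigram probabilities."""
    return {bigram for bigram, s in scores.items() if s > threshold}
```

A constant `threshold`, as in the conventional scheme the abstract contrasts against, treats every context alike; the paper's dynamic-threshold variant would instead choose the cutoff per word or per context based on the same KL criterion.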