https://scholars.lib.ntu.edu.tw/handle/123456789/558966
Title: | Training Code-Switching Language Model with Monolingual Data | Authors: | Chuang, S.-P.; Sung, T.-W.; HUNG-YI LEE |
Keywords: | Code-Switching; Language Model | Issue date: | 2020 | Volume: | 2020-May | Pages: | 7949-7953 | Source publication: | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings | Abstract: | The lack of code-switching data is a problem when training code-switching language models. In this paper, we propose an approach to train code-switching language models with monolingual data only. By constraining and normalizing the output projection matrix in an RNN-based language model, we make the embeddings of different languages close to each other. With numerical and visualized results, we show that the proposed approaches remarkably improve code-switching language modeling trained on monolingual data. The proposed approaches are comparable to, or even better than, training a code-switching language model with artificially generated code-switching data. Furthermore, we use unsupervised bilingual word translation to analyze whether semantically equivalent words in different languages are mapped together. © 2020 IEEE. |
URI: | https://www.scopus.com/inward/record.url?eid=2-s2.0-85089239498&partnerID=40&md5=d7c9a430f1697ed819b84fd18bd19a33 https://scholars.lib.ntu.edu.tw/handle/123456789/558966 |
DOI: | 10.1109/ICASSP40776.2020.9053775 |
Appears in collections: | Department of Electrical Engineering |
Items in the IR system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.