https://scholars.lib.ntu.edu.tw/handle/123456789/413140
Title: | A simplification-translation-restoration framework for cross-domain SMT applications |
Authors: | Chen H.-B.; Huang H.-H.; Chen H.-H.; Tan C.-T. |
Keywords: | Cross-domain SMT; Domain adaptation; Statistical machine translation |
Date issued: | 2012 |
Pages: | 545-560 |
Source publication: | 24th International Conference on Computational Linguistics |
Abstract: | Integration of domain specific knowledge into a general purpose statistical machine translation (SMT) system poses challenges due to insufficient bilingual corpora. In this paper we propose a simplification-translation-restoration (STR) framework for domain adaptation in SMT by simplifying domain specific segments of a text. For an in-domain text, we identify the critical segments and modify them to alleviate the data sparseness problem in the out-domain SMT system. After we receive the translation result, these critical segments are then restored according to the provided in-domain knowledge. We conduct experiments on an English-to-Chinese translation task in the medical domain and evaluate each step of the STR framework. The translation results show significant improvement of our approach over the out-domain and the naïve in-domain SMT systems. © 2012 The COLING. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/413140 |
Appears in: | Department of Computer Science and Information Engineering (資訊工程學系) |
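The three stages named in the abstract (simplify, translate, restore) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not say how critical segments are identified or modified, so the sketch assumes a plain glossary lookup and swaps in numbered placeholders for the paper's simplification step. The names `GLOSSARY`, `simplify`, `restore`, `str_translate`, and `toy_translate` are all hypothetical, and the glossary entry and toy translator are illustrative stand-ins for real in-domain knowledge and a real out-domain SMT system.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical in-domain knowledge: source term -> target-language translation.
# The single entry is illustrative only, not data from the paper.
GLOSSARY: Dict[str, str] = {"myocardial infarction": "心肌梗塞"}


def simplify(text: str, glossary: Dict[str, str]) -> Tuple[str, List[str]]:
    """Replace each glossary term with a numbered placeholder.

    Assumption: placeholders stand in for the paper's simplification step.
    Returns the simplified text and the replaced terms in placeholder order.
    """
    replaced: List[str] = []
    # Longest terms first, so multi-word terms win over their substrings.
    for term in sorted(glossary, key=len, reverse=True):
        while term in text:
            text = text.replace(term, f"__SEG{len(replaced)}__", 1)
            replaced.append(term)
    return text, replaced


def restore(translated: str, replaced: List[str], glossary: Dict[str, str]) -> str:
    """Substitute the in-domain translation back in for each placeholder."""
    for i, term in enumerate(replaced):
        translated = translated.replace(f"__SEG{i}__", glossary[term])
    return translated


def str_translate(text: str, glossary: Dict[str, str],
                  translate: Callable[[str], str]) -> str:
    """Run simplify -> translate -> restore around any translator callable."""
    simplified, replaced = simplify(text, glossary)
    return restore(translate(simplified), replaced, glossary)


def toy_translate(sentence: str) -> str:
    """Stand-in for an out-domain SMT system; it knows one general phrase."""
    return sentence.replace("The patient has ", "病人患有")


result = str_translate("The patient has myocardial infarction",
                       GLOSSARY, toy_translate)
```

The sketch relies on the translator passing placeholder tokens through unchanged. A real SMT system may reorder or alter such tokens, which is one reason the framework described in the abstract simplifies segments rather than masking them.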