Cluse: Cross-lingual unsupervised sense embeddings

Chi, T.-C.; YUN-NUNG CHEN

Journal

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018

Pages

271-281

Date Issued

2020

Author(s)

Chi, T.-C.

YUN-NUNG CHEN

URI

https://www.scopus.com/inward/record.url?eid=2-s2.0-85081740727&partnerID=40&md5=2bf8a73c0277f58d64d312ca91f8f472

https://scholars.lib.ntu.edu.tw/handle/123456789/559428

Abstract

This paper proposes a modularized sense induction and representation learning model that jointly learns bilingual sense embeddings that align well in the vector space, where the cross-lingual signal in the English-Chinese parallel corpus is exploited to capture the collocation and distributed characteristics in the language pair. The model is evaluated on the Stanford Contextual Word Similarity (SCWS) dataset to ensure the quality of monolingual sense embeddings. In addition, we introduce Bilingual Contextual Word Similarity (BCWS), a large and high-quality dataset for evaluating cross-lingual sense embeddings, which is the first attempt to measure whether the learned embeddings are indeed aligned well in the vector space. The proposed approach shows the superior quality of sense embeddings evaluated in both monolingual and bilingual spaces. © 2018 Association for Computational Linguistics

Other Subjects

Embeddings; Large dataset; Natural language processing systems; Vector spaces; Contextual words; Cross-lingual; Distributed characteristics; English-Chinese parallel corpora; High quality; Language pairs; Learning models; Modularized; Quality control

Type

conference paper
