MSD-1030: A well-built multi-sense evaluation dataset for sense representation models

Yen T.-Y.; Lee Y.-Y.; Shiue Y.-T.; Huang H.-H.; Chen H.-H.

Title:	MSD-1030: A well-built multi-sense evaluation dataset for sense representation models
Authors:	Yen T.-Y.; Lee Y.-Y.; Shiue Y.-T.; Huang H.-H.; Chen H.-H.
Keywords:	Benchmarking; Embeddings; Petroleum reservoir evaluation; Semantics; Benchmark datasets; Representation model; Semantic similarity; Word models; Word-pairs; Large dataset
Publication date:	2020
Pages:	5802-5809
Source publication:	LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings
Abstract:	Sense embedding models handle polysemy by giving each distinct meaning of a word form a separate representation. They are considered improvements over word models, and their effectiveness is usually judged with benchmarks such as semantic similarity datasets. However, most of these datasets are not designed for evaluating sense embeddings. In this research, we show that there are at least six concerns about evaluating sense embeddings with existing benchmark datasets, including the large proportions of single-sense words and the unexpected inferior performance of several multi-sense models to their single-sense counterparts. These observations call into serious question whether evaluations based on these datasets can reflect the sense model's ability to capture different meanings. To address the issues, we propose the Multi-Sense Dataset (MSD-1030), which contains a high ratio of multi-sense word pairs. A series of analyses and experiments show that MSD-1030 serves as a more reliable benchmark for sense embeddings. The dataset is available at http://nlg.csie.ntu.edu.tw/nlpresource/MSD-1030/. © European Language Resources Association (ELRA), licensed under CC-BY-NC
URI:	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85096545131&partnerID=40&md5=60cfeda24d215d916d5b219db4b7001a https://scholars.lib.ntu.edu.tw/handle/123456789/581350
Appears in:	Department of Computer Science and Information Engineering
