https://scholars.lib.ntu.edu.tw/handle/123456789/581448
Title: Unseen filler generalization in attention-based natural language reasoning models
Authors: Chen, C.-H.; Fu, Y.-F.; Cheng, H.-H.; Shou-De Lin
Keywords: Attention; Machine reasoning; Memory-augmented neural network; Transformer; Unseen filler
Publication Date: 2020
Pages: 42–51
Source Publication: Proceedings - 2020 IEEE 2nd International Conference on Cognitive Machine Intelligence, CogMI 2020
Abstract: Recent natural language reasoning models have achieved human-level accuracy on several benchmark datasets such as bAbI. While these results are impressive, in this paper we argue through experimental analysis that several existing attention-based models struggle to generalize to named entities not seen in the training data. We therefore propose Unseen Filler Generalization (UFG) as a task, along with two new datasets, to evaluate the filler generalization capability of a natural language reasoning model. We also propose a simple yet general strategy, applicable to various models, that addresses the UFG challenge by modifying the entity occurrence distribution in the training data. This strategy lets the model encounter unseen entities during training, so it does not overfit to a few specific named entities. Our experiments show that this strategy can significantly boost the filler generalization capability of three existing models: Entity Network, Working Memory Network, and Universal Transformers. © 2020 IEEE.
URI: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85100602224&doi=10.1109%2fCogMI50398.2020.00016&partnerID=40&md5=7ebb31d57c730cbaf68eb36a499fac9b ; https://scholars.lib.ntu.edu.tw/handle/123456789/581448
DOI: 10.1109/CogMI50398.2020.00016
SDG/Keywords: Attention; Machine reasoning; Memory-augmented neural network; Transformer; Unseen filler
Appears in Collections: Department of Computer Science and Information Engineering
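The abstract describes the proposed strategy only at a high level: modify the entity occurrence distribution in the training data so the model keeps encountering unseen entities. Below is a minimal, hypothetical sketch of one way such an augmentation could look for bAbI-style stories. The name pool (`REPLACEMENT_POOL`), the uniform sampling, the whitespace tokenization, and the function `resample_entities` are all illustrative assumptions, not the paper's actual procedure.

```python
import random

# Entities that recur throughout a bAbI-style training set.
ORIGINAL_ENTITIES = ["Mary", "John", "Sandra", "Daniel"]

# A much larger pool of substitute names (assumed size and naming scheme).
REPLACEMENT_POOL = [f"Person{i}" for i in range(1000)]


def resample_entities(story_lines, rng=random):
    """Swap every known entity in one story for a freshly drawn name,
    consistently within the story so coreference links are preserved."""
    # Draw without replacement so two entities never collapse into one name.
    fresh_names = rng.sample(REPLACEMENT_POOL, len(ORIGINAL_ENTITIES))
    mapping = dict(zip(ORIGINAL_ENTITIES, fresh_names))
    return [
        " ".join(mapping.get(token, token) for token in line.split())
        for line in story_lines
    ]


story = [
    "Mary moved to the bathroom .",
    "John went to the hallway .",
    "Where is Mary ?",
]
# Re-sampling on every epoch flattens the occurrence distribution of any
# single name, so no filler identity is seen often enough to memorize.
print(resample_entities(story))
```

Under this reading, the per-story mapping is the key design point: replacing names inconsistently within a story would break the coreference structure the reasoning task depends on.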