Title: | Efficient Multi-Task Auxiliary Learning: Selecting Auxiliary Data by Feature Similarity | Authors: | Kung P.-N.; Chen Y.-C.; Yin S.-S.; Yang T.-H.; YUN-NUNG CHEN |
Publication Date: | 2021 | Pages: | 416-428 | Source Publication: | EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings | Abstract: | Multi-task auxiliary learning uses a set of relevant auxiliary tasks to improve the performance of a primary task. A common practice is to manually select multiple auxiliary tasks and perform multi-task learning on all of their data, which raises two issues: (1) selecting auxiliary tasks that benefit a given primary task is nontrivial; (2) when the auxiliary datasets are large, training on all of the data becomes time-consuming and impractical. This paper addresses these problems with a time-efficient sampling method that selects the auxiliary data most relevant to the primary task. The proposed method trains only on the most beneficial subsets of the auxiliary datasets, achieving efficient multi-task auxiliary learning. Experiments on three benchmark datasets (RTE, MRPC, STS-B) show that the method significantly outperforms random sampling and ST-DNN. Moreover, with this method, the model surpasses a fully-trained MT-DNN on RTE, MRPC, and STS-B using only 50%, 66%, and 1% of the auxiliary data, respectively. © 2021 Association for Computational Linguistics |
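The abstract describes selecting the auxiliary examples most relevant to the primary task by feature similarity. A minimal sketch of that general idea (not the paper's exact algorithm) is to embed all examples in a shared feature space, score each auxiliary example by cosine similarity to the centroid of the primary-task features, and keep only the top-ranked fraction; the feature vectors, `keep_ratio`, and centroid scoring here are illustrative assumptions.

```python
import numpy as np

def select_auxiliary_by_similarity(primary_feats, aux_feats, keep_ratio=0.5):
    """Rank auxiliary examples by cosine similarity to the mean
    primary-task feature vector; keep the top `keep_ratio` fraction.
    Illustrative sketch only -- the paper's scoring may differ."""
    # Centroid of the primary-task features, unit-normalized so that
    # dot products with normalized auxiliary features are cosine scores.
    centroid = primary_feats.mean(axis=0)
    centroid = centroid / np.linalg.norm(centroid)
    aux_norm = aux_feats / np.linalg.norm(aux_feats, axis=1, keepdims=True)
    scores = aux_norm @ centroid  # one similarity score per auxiliary example
    k = max(1, int(keep_ratio * len(aux_feats)))
    top_idx = np.argsort(-scores)[:k]  # indices of the k most similar examples
    return top_idx, scores

# Toy usage with random "features" standing in for real encoder outputs.
rng = np.random.default_rng(0)
primary = rng.normal(size=(100, 16))   # primary-task feature vectors
aux = rng.normal(size=(1000, 16))      # auxiliary-task feature vectors
idx, scores = select_auxiliary_by_similarity(primary, aux, keep_ratio=0.1)
print(len(idx))  # 100 examples kept out of 1000
```

In practice the features would come from a shared encoder (e.g. a pretrained language model), and the selected subset would replace the full auxiliary dataset during multi-task training.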
URI: | https://www.scopus.com/inward/record.uri?eid=2-s2.0-85127417086&partnerID=40&md5=8b2fcd242c781ea371e6b668f539029c https://scholars.lib.ntu.edu.tw/handle/123456789/632064 |
SDG/Keywords: | Computational linguistics; Large dataset; Auxiliary data; Benchmark datasets; Efficient sampling; Multi tasks; Performance; Primary task; Random sampling; Sampling method; Time-efficient; Learning systems |
Appears in Collections: | Department of Computer Science and Information Engineering |
Items in this IR system are protected by copyright, with all rights reserved, unless otherwise indicated in their license terms.