https://scholars.lib.ntu.edu.tw/handle/123456789/640018
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Chang, Kai Wei | en_US |
dc.contributor.author | Chen, Ming Hsin | en_US |
dc.contributor.author | Lin, Yun Ping | en_US |
dc.contributor.author | Hsu, Jing Neng | en_US |
dc.contributor.author | Huang, Paul Kuo Ming | en_US |
dc.contributor.author | Huang, Chien Yu | en_US |
dc.contributor.author | Li, Shang Wen | en_US |
dc.contributor.author | Lee, Hung-Yi | en_US |
dc.date.accessioned | 2024-02-29T05:59:59Z | - |
dc.date.available | 2024-02-29T05:59:59Z | - |
dc.date.issued | 2023-01-01 | - |
dc.identifier.isbn | 9798350306897 | - |
dc.identifier.uri | https://scholars.lib.ntu.edu.tw/handle/123456789/640018 | - |
dc.description.abstract | Prompting and adapter tuning have emerged as efficient alternatives to fine-tuning (FT) methods. However, existing studies on speech prompting focused on classification tasks and failed on more complex sequence generation tasks. Besides, adapter tuning is primarily applied with a focus on encoder-only self-supervised models. Our experiments show that prompting on Wav2Seq, a self-supervised encoder-decoder model, surpasses previous works in sequence generation tasks. It achieves a remarkable 53% relative improvement in word error rate for ASR and a 27% relative improvement in F1 score for slot filling. Additionally, prompting competes with the FT method in the low-resource scenario. Moreover, we show the transferability of prompting and adapter tuning on Wav2Seq in cross-lingual ASR. When limited trainable parameters are involved, prompting and adapter tuning consistently outperform conventional FT across 7 languages. Notably, in the low-resource scenario, prompting consistently outperforms adapter tuning. | en_US |
dc.relation.ispartof | 2023 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2023 | en_US |
dc.subject | adapter | automatic speech recognition | parameter-efficient tuning | Prompting | sequence generation | en_US |
dc.title | Prompting and Adapter Tuning For Self-Supervised Encoder-Decoder Speech Model | en_US |
dc.type | conference paper | en_US |
dc.identifier.doi | 10.1109/ASRU57964.2023.10389731 | - |
dc.identifier.scopus | 2-s2.0-85184666668 | - |
dc.identifier.url | https://api.elsevier.com/content/abstract/scopus_id/85184666668 | - |
dc.relation.pageend | 8 | en_US |
item.openairetype | conference paper | - |
item.openairecristype | http://purl.org/coar/resource_type/c_5794 | - |
item.fulltext | no fulltext | - |
item.grantfulltext | none | - |
item.cerifentitytype | Publications | - |
crisitem.author.dept | Electrical Engineering | - |
crisitem.author.dept | Intel-NTU Connected Context Computing Center | - |
crisitem.author.dept | Communication Engineering | - |
crisitem.author.dept | Computer Science and Information Engineering | - |
crisitem.author.dept | Networking and Multimedia | - |
crisitem.author.dept | Center for Artificial Intelligence and Advanced Robotics | - |
crisitem.author.dept | Master's Program in Smart Medicine and Health Informatics (SMART-MHI) | - |
crisitem.author.orcid | 0000-0002-9654-5747 | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
crisitem.author.parentorg | Others: University-Level Research Centers | - |
crisitem.author.parentorg | Others: International Research Centers | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
crisitem.author.parentorg | College of Electrical Engineering and Computer Science | - |
crisitem.author.parentorg | Others: University-Level Research Centers | - |
crisitem.author.parentorg | International College | - |
Appears in Collections: | Department of Electrical Engineering |
Items in the IR system are protected by copyright, with all rights reserved, unless otherwise indicated in their individual license terms.