Investigating the Reordering Capability in CTC-based Non-Autoregressive End-to-End Speech Translation
Journal
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
Pages
1068-1077
Date Issued
2021
Author(s)
Abstract
We study the possibility of building a non-autoregressive speech-to-text translation model using connectionist temporal classification (CTC), and use CTC-based automatic speech recognition as an auxiliary task to improve performance. CTC's success on translation is counter-intuitive given its monotonicity assumption, so we analyze its reordering capability. Kendall's tau distance is introduced as the quantitative metric, and gradient-based visualization provides an intuitive way to take a closer look into the model. Our analysis shows that transformer encoders have the ability to change the word order, and it points out a future research direction that is worth exploring further in non-autoregressive speech translation. © 2021 Association for Computational Linguistics
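As an illustration of the metric named in the abstract, the following is a minimal sketch of the textbook Kendall's tau distance between two orderings of the same items: the number of item pairs that the two orderings place in opposite relative order. The function names and the normalization by n(n-1)/2 are assumptions made for this example; the abstract does not say how the paper derives word orderings from model outputs or whether it normalizes the count.

```python
from itertools import combinations


def kendall_tau_distance(order_a, order_b):
    """Count item pairs that the two orderings rank in opposite order.

    Both arguments must be sequences containing the same unique items.
    """
    if len(set(order_a)) != len(order_a) or sorted(order_a) != sorted(order_b):
        raise ValueError("orderings must contain the same unique items")

    # Position of each item in the second ordering.
    position_in_b = {item: index for index, item in enumerate(order_b)}

    # Re-express the first ordering as positions in the second ordering.
    ranks = [position_in_b[item] for item in order_a]

    # A pair is discordant when its relative order is inverted.
    return sum(1 for i, j in combinations(range(len(ranks)), 2)
               if ranks[i] > ranks[j])


def normalized_kendall_tau_distance(order_a, order_b):
    """Scale the distance to [0, 1] by the number of pairs (assumed convention)."""
    n = len(order_a)
    if n < 2:
        return 0.0
    return kendall_tau_distance(order_a, order_b) / (n * (n - 1) / 2)
```

Under this definition, a distance of 0 means the two orderings agree on every pair (fully monotonic), and the normalized value reaches 1 when one ordering is the exact reverse of the other.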
Subjects
Character recognition
Classification (of information)
Computational linguistics
Natural language processing systems
Speech
Text processing
Translation (languages)
Auto-regressive
Automatic speech recognition
End to end
Kendall taus
Monotonicity
Performance
Speech translation
Temporal classification
Temporal use
Translation models
Speech recognition
Type
conference paper
