Improving End-to-end Taiwanese-Speech-to-Chinese-Text Translation by Semi-supervised Learning
Journal
ROCLING 2023 - Proceedings of the 35th Conference on Computational Linguistics and Speech Processing
ISBN
9789869576963
Date Issued
2023-01-01
Author(s)
Abstract
The main challenges in Taiwanese speech recognition are the scarcity of publicly available Taiwanese speech corpora and the lack of a consistent writing system for Taiwanese. The former leaves too little data for speech recognition tasks, while the latter leads to inconsistent output formats that are difficult to interpret. This study therefore takes translation from Taiwanese speech to Chinese text as its task, and builds a speech translation model by combining a pre-trained speech model with an end-to-end deep learning architecture. Our method starts from a small amount of Taiwanese speech paired with Chinese text, collects a large amount of unpaired Taiwanese speech data, and designs several algorithms that use this unpaired corpus to improve the Taiwanese-speech-to-Chinese-text translation system. The research and discussion are divided into four improvement directions: the end-to-end speech translation model, pre-trained speech model features, the iterative training method, and corpus cleaning. Experimental results show that these methods effectively improve the performance of Taiwanese-speech-to-Chinese-text translation.
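The abstract does not spell out the iterative training or corpus cleaning algorithms. As a rough illustration only, the sketch below assumes a generic self-training loop: a model trained on the paired data labels the unpaired data, low-confidence hypotheses are filtered out (a simple stand-in for corpus cleaning), and the model is retrained on the enlarged set. All names here (`train_nearest_neighbour`, `iterative_self_training`, the `threshold` parameter) and the one-dimensional toy features are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, str]                    # (toy speech feature, Chinese text)
Model = Callable[[float], Tuple[str, float]]   # returns (hypothesis, confidence)


def train_nearest_neighbour(data: List[Example]) -> Model:
    """Toy stand-in for training a speech translation model (assumption)."""
    def predict(x: float) -> Tuple[str, float]:
        # Pick the closest training example; confidence decays with distance.
        feat, text = min(data, key=lambda ex: abs(ex[0] - x))
        return text, 1.0 / (1.0 + abs(feat - x))
    return predict


def iterative_self_training(
    paired: List[Example],
    unpaired: List[float],
    train: Callable[[List[Example]], Model],
    rounds: int = 3,
    threshold: float = 0.6,
) -> Tuple[Model, List[Example]]:
    """Retrain repeatedly on paired data plus filtered pseudo-labels."""
    model = train(paired)
    pseudo: List[Example] = []
    for _ in range(rounds):
        pseudo = []
        for x in unpaired:
            text, confidence = model(x)
            # Cleaning step (assumed): keep only confident hypotheses.
            if confidence >= threshold:
                pseudo.append((x, text))
        # Original paired data is always kept; pseudo-labels are regenerated.
        model = train(paired + pseudo)
    return model, pseudo


if __name__ == "__main__":
    paired = [(0.0, "你好"), (10.0, "謝謝")]
    unpaired = [0.5, 1.0, 1.6, 9.0, 5.0]
    model, pseudo = iterative_self_training(paired, unpaired, train_nearest_neighbour)
    print(len(pseudo), "pseudo-labelled examples kept")
```

In this toy setting each round admits examples that the previous model could not label confidently, which is the intuition behind iterating rather than pseudo-labelling once. A real system would replace the nearest-neighbour stand-in with the actual speech translation model and a confidence measure suited to it.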
Subjects
Corpus cleaning | End-to-end speech translation | Semi-supervised learning
Type
conference paper
