Tone Labeling by Deep Learning-based Tone Recognizer for Mandarin Speech

Wu-Hao Li; Chen-Yu Chiang; TE-HSIN LIU

doi:10.1109/apsipaasc58517.2023.10317518

Journal

2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

Start Page

873

End Page

880

Date Issued

2023-10-31

Author(s)

Wu-Hao Li

Chen-Yu Chiang

TE-HSIN LIU

DOI

10.1109/apsipaasc58517.2023.10317518

URI

https://scholars.lib.ntu.edu.tw/handle/123456789/720012

https://www.scopus.com/inward/record.uri?eid=2-s2.0-85180010068&doi=10.1109%2fAPSIPAASC58517.2023.10317518&partnerID=40&md5=c1092f96e0f13746611270d24226e075

Abstract

Tone labeling of tone sandhi and polyphones is crucial when preparing a high-quality speech corpus for constructing a Mandarin text-to-speech system. Correct tone labeling may ensure that the constructed text-to-speech system can generate a natural prosody. This paper proposes tone labeling using an iterative method with a deep learning-based tone recognizer. The experimental results showed that the proposed method could robustly label tones for syllables of tone sandhi and polyphones on a multi-speaking rate Mandarin speech corpus. Furthermore, this study found that syllables misrecognized as different tones from lexical tones may reflect the true tone realizations caused by coarticulation, location in a prosodic structure, and speaking rates. This study also provided a quantitative analysis of the relationship between labeled tones and prosodic structure to conform to the characteristics found in previous linguistic studies. © 2023 IEEE.

SDGs

SDG4

Publisher

IEEE

Type

conference paper
