3D Facial Modeling and Animation with Speech / Lip Synchronization for Human-Robot Interactions

Huang, Chien-Chieh

Date Issued

2011

Author(s)

Huang, Chien-Chieh

URI

http://ntur.lib.ntu.edu.tw//handle/246246/253961

Abstract

In the 21st century, intelligent robotics has become one of the most important industries worldwide. Many robotics institutions are developing modern, multi-functional robots of various types, such as wheeled robots and biped robots. As the elderly population grows and economic pressure increases, both parents in most families have to work. In response, we built an application aimed at children and the elderly. Human-robot interaction (HRI) is an important technology in the field of intelligent robotics. In this thesis, we use sound and voice as commands to communicate with robots. The system consists of two major parts: head modeling and speech processing. Synchronizing speech with mouth shape draws on technologies such as computer vision, speech synthesis, and speech recognition. We present a method for synchronizing lip movement with speech, using Microsoft’s Speech Application Programming Interface (SAPI) as the speech synthesis and recognition tool. Speech animation has two components: the speech and the image. The speech output is produced by Text-to-Speech (TTS), and the viseme images are generated with the FaceGen Modeller software. Three key pictures are imported into FaceGen Modeller to calibrate and generate the face model. A viseme event handler written in C# links each viseme to the image of the corresponding mouth shape: the images are loaded in sequence, so that each viseme is matched one by one with its image. Speech synthesis is mainly applied in assistive devices, for example screen readers for people with visual impairment, and a mute person can use this technology to talk to others. In recent years, speech synthesis has also been applied extensively in service robotics and in entertainment and educational products such as language learning, education, video games, animations, and music videos. Finally, we provide a quick method for building a 3D head model and synchronizing it with speech.
This application can be used to teach children English reading and listening. For specific groups, such as mute or deaf people, it can serve as a communication tool.
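
The viseme-to-image matching described in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the thesis uses a C# event handler driven by SAPI, whereas the Python below only simulates the idea. The image folder and file naming, the `LipSyncPlayer` class, and the `on_viseme` callback are all hypothetical names chosen for illustration. The only detail taken from SAPI is that it defines 22 viseme IDs (0 through 21), with 0 representing silence.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

NUM_VISEMES = 22  # SAPI defines viseme IDs 0..21; ID 0 is silence


def build_viseme_images(folder: str = "visemes") -> Dict[int, str]:
    """Map each viseme ID to an image path (hypothetical naming scheme)."""
    return {vid: f"{folder}/viseme_{vid:02d}.png" for vid in range(NUM_VISEMES)}


@dataclass
class Frame:
    start_ms: int     # when this mouth shape starts, relative to speech start
    duration_ms: int  # how long the mouth shape is held
    image: str        # image shown for this viseme


class LipSyncPlayer:
    """Receives viseme events in order and shows the matching mouth image."""

    def __init__(self, images: Dict[int, str], show: Callable[[str], None]):
        self.images = images
        self.show = show              # assumed display callback, e.g. a GUI update
        self.frames: List[Frame] = []
        self._clock_ms = 0

    def on_viseme(self, viseme_id: int, duration_ms: int) -> None:
        # Unknown IDs fall back to the silence image (viseme 0).
        image = self.images.get(viseme_id, self.images[0])
        self.frames.append(Frame(self._clock_ms, duration_ms, image))
        self._clock_ms += duration_ms
        self.show(image)


# Simulated event stream standing in for the speech engine's viseme events.
shown: List[str] = []
player = LipSyncPlayer(build_viseme_images(), shown.append)
for viseme_id, duration_ms in [(0, 50), (12, 90), (4, 110), (0, 60)]:
    player.on_viseme(viseme_id, duration_ms)
```

In a real application the simulated loop would be replaced by the speech engine's viseme notifications, and `show` would update the rendered face image rather than append to a list.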

Subjects

lip synchronization

speech recognition

speech synthesis

3D head model

facial animation

Type

thesis

File(s)

Name

ntu-100-R97921047-1.pdf

Size

23.32 KB

Format

Adobe PDF

Checksum

(MD5):0b72e7d45309bcc36adfa9f67cb4e582
