3D Facial Modeling and Animation with Speech / Lip Synchronization for Human-Robot Interactions
Date Issued
2011
Author(s)
Huang, Chien-Chieh
Abstract
In the 21st century, intelligent robotics has become one of the most important industries worldwide. Many intelligent robotics institutions are developing modern, multi-functional robots of many types, such as wheeled robots and biped robots. With a growing elderly population and the economic pressures of present-day society, both parents in most families have to work. In response to this situation, we developed an application for children and the elderly.
Human-robot interaction (HRI) is an important technology in the field of intelligent robotics. In this thesis, we use sound and voice as commands to communicate with robots. The work consists of two major parts: head modeling and speech processing.
Synchronizing speech with mouth shape involves technologies such as computer vision, speech synthesis, and speech recognition. We present a method to synchronize lip movement with speech, using Microsoft’s Speech Application Programming Interface (SAPI) as the speech synthesis and recognition tool. Speech animation has two components: the speech and the images. The speech output is produced by Text-to-Speech (TTS) synthesis, and the viseme images are generated with the FaceGen Modeller software.
Three key pictures are imported into FaceGen Modeller to calibrate and generate the face model. The viseme event handler, written in C#, links each viseme to its mouth-shape image. The images are loaded sequentially so that each viseme is matched, one by one, with the correct image.
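The viseme-to-image matching described above can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the C# used in the thesis, and it is not the author's implementation. It assumes that each viseme event carries a viseme ID in the SAPI range 0 to 21 and a duration in milliseconds; the `VisemeEvent` class, the `build_timeline` function, and the `viseme_NN.png` file names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# SAPI defines 22 viseme IDs (0 to 21); ID 0 represents silence.
NUM_VISEMES = 22


@dataclass
class VisemeEvent:
    """Hypothetical stand-in for the data delivered by a viseme event."""
    viseme_id: int    # expected range: 0..21
    duration_ms: int  # how long this mouth shape is held


def image_for_viseme(viseme_id: int) -> str:
    """Map a viseme ID to a mouth-shape image (assumed naming scheme)."""
    if not 0 <= viseme_id < NUM_VISEMES:
        raise ValueError(f"viseme ID out of range: {viseme_id}")
    return f"viseme_{viseme_id:02d}.png"


def build_timeline(events: List[VisemeEvent]) -> List[Tuple[int, str]]:
    """Return (start_ms, image_name) pairs in playback order.

    Each image starts where the previous viseme ends, so the frames
    follow the speech one viseme at a time.
    """
    timeline = []
    start_ms = 0
    for event in events:
        timeline.append((start_ms, image_for_viseme(event.viseme_id)))
        start_ms += event.duration_ms
    return timeline


if __name__ == "__main__":
    # Illustrative input only; these are not measured values.
    demo = [VisemeEvent(0, 100), VisemeEvent(12, 80), VisemeEvent(4, 120)]
    for start, image in build_timeline(demo):
        print(start, image)
```

In a live system the handler would display each image as its event arrives instead of building the whole timeline first; the precomputed list is used here only to keep the sketch self-contained.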
Speech synthesis is mainly applied in assistive devices, e.g., screen readers for people with visual impairment. A person who cannot speak can use this technology to talk to others. In recent years, speech synthesis has been applied extensively in service robotics and in educational and entertainment products such as language-learning tools, video games, animations, and music videos.
Finally, we establish a quick method for building a 3D head model and synchronizing it with speech. This application can be used to teach children English reading and listening. For specific groups, such as people who are mute or deaf, it can serve as a communication tool.
Subjects
lip synchronization
speech recognition
speech synthesis
3D head model
facial animation
Type
thesis
File(s)
Name
ntu-100-R97921047-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):0b72e7d45309bcc36adfa9f67cb4e582
