Automatic facial image annotation and retrieval by integrating voice label and visual appearance

Jheng H.-W.; Chen B.-C.; Chen Y.-Y.; WINSTON HSU

doi:10.1145/2647868.2655015

Journal

2014 ACM Conference on Multimedia

Pages

1001-1004

ISBN

9781450330633

Date Issued

2014

Author(s)

Jheng H.-W.

Chen B.-C.

Chen Y.-Y.

WINSTON HSU

DOI

10.1145/2647868.2655015

URI

https://scholars.lib.ntu.edu.tw/handle/123456789/413002

Abstract

Annotation is important for managing and retrieving large photo collections, but it is generally labor-intensive and time-consuming. Speaking while taking photos, by contrast, is straightforward and effortless, and annotating by voice is faster than typing. To reduce the manual cost of annotating photos, we propose a novel framework that uses the scarce spoken annotations recorded at capture time as voice labels and automatically labels every facial image in the photo collection. To accomplish this, we employ a probabilistic graphical model that integrates voice labels and visual appearance for inference. Combined with group prior estimation and gender attribute association, the model achieves outstanding performance on the proposed synthesized group photo collections.
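
The idea of spreading a few voice labels to every face through visual similarity can be illustrated with a small sketch. The abstract does not specify the probabilistic graphical model, so the code below swaps in plain graph label propagation instead; it is not the authors' method and omits group priors and gender attributes. The function name `propagate_labels`, the parameters `alpha` and `iters`, and the toy similarity matrix are all hypothetical.

```python
def propagate_labels(similarity, seeds, num_labels, alpha=0.8, iters=50):
    """Spread sparse seed labels over a face-similarity graph.

    similarity: N x N list of non-negative face-to-face similarities (assumed input).
    seeds: dict mapping a face index to a label index (stand-in for voice labels).
    Returns one predicted label index per face.
    """
    n = len(similarity)

    # Row-normalise the similarities, ignoring each face's similarity to itself.
    trans = []
    for i in range(n):
        row = [similarity[i][j] if i != j else 0.0 for j in range(n)]
        total = sum(row)
        trans.append([v / total if total > 0 else 0.0 for v in row])

    # One-hot scores for the voice-labelled faces, zeros for all others.
    y = [[0.0] * num_labels for _ in range(n)]
    for face, label in seeds.items():
        y[face][label] = 1.0

    # Repeatedly mix each face's scores with those of its visual neighbours,
    # while pulling labelled faces back towards their seed label.
    f = [row[:] for row in y]
    for _ in range(iters):
        f = [
            [
                alpha * sum(trans[i][j] * f[j][c] for j in range(n))
                + (1 - alpha) * y[i][c]
                for c in range(num_labels)
            ]
            for i in range(n)
        ]

    # Each face takes the label with the highest score.
    return [max(range(num_labels), key=lambda c: f[i][c]) for i in range(n)]


# Toy data: faces 0 and 1 look alike, as do faces 2 and 3.
toy_similarity = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
# Only faces 0 and 2 carry a (hypothetical) voice label.
print(propagate_labels(toy_similarity, {0: 0, 2: 1}, num_labels=2))
```

In this toy setup only two of the four faces are labelled, and the unlabelled faces inherit the label of their nearest visual neighbour. A full graphical model would also encode the additional cues the abstract mentions.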

Subjects

Face annotation; Image retrieval; Spoken annotation

SDGs

SDG5

Type

conference paper
