Visual cue cluster construction via information bottleneck principle and kernel density estimation
Journal
Lecture Notes in Computer Science
Journal Volume
3568
Pages
82
Date Issued
2005-10-17
Author(s)
Chang, Shih Fu
Abstract
Recent research in video analysis has shown a promising direction, in which mid-level features (e.g., people, anchor, indoor) are abstracted from low-level features (e.g., color, texture, motion) and used for discriminative classification of semantic labels. In most systems, however, such mid-level features are selected manually. In this paper, we propose an information-theoretic framework, visual cue cluster construction (VC3), to automatically discover adequate mid-level features. The problem is posed as mutual information maximization, through which optimal cue clusters are discovered that preserve the highest information about the semantic labels. We extend the Information Bottleneck framework to high-dimensional continuous features and further propose a projection method to map each video into probabilistic memberships over all the cue clusters. The biggest advantage of the proposed approach is that it removes the dependence on manual selection of mid-level features and the heavy labor cost of annotating a training corpus for the detector of each mid-level feature. The proposed VC3 framework is general and effective, and shows potential for solving other problems in semantic video analysis. When tested on news video story segmentation, the proposed approach achieves a promising performance gain over representations derived from conventional clustering techniques and even over manually selected mid-level features. © Springer-Verlag Berlin Heidelberg 2005.
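The two quantities the abstract relies on can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fixed isotropic Gaussian bandwidth, and the assumption that a hard assignment of training samples to cue clusters is already available are all hypothetical choices made for illustration. The first function computes the mutual information I(C;Y) between cue clusters C and semantic labels Y from a joint table; the other two project a continuous feature vector into probabilistic cluster memberships using a kernel density estimate of p(x | c).

```python
import numpy as np

def mutual_information(joint):
    """I(C;Y) in bits from a joint table p(c, y) (rows: clusters, cols: labels)."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()              # normalize counts to probabilities
    pc = joint.sum(axis=1, keepdims=True)    # marginal p(c), shape (C, 1)
    py = joint.sum(axis=0, keepdims=True)    # marginal p(y), shape (1, Y)
    mask = joint > 0                         # skip zero cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (pc @ py)[mask])))

def kde_log_density(x, samples, bandwidth):
    """Log of an isotropic Gaussian kernel density estimate at point x."""
    d = samples.shape[1]
    neg_sq = -np.sum((samples - x) ** 2, axis=1) / (2.0 * bandwidth ** 2)
    log_norm = -0.5 * d * np.log(2.0 * np.pi * bandwidth ** 2) - np.log(len(samples))
    m = neg_sq.max()                         # log-sum-exp for numerical stability
    return log_norm + m + np.log(np.sum(np.exp(neg_sq - m)))

def soft_membership(x, features, assignments, n_clusters, bandwidth=1.0):
    """p(c | x) proportional to p(c) * p(x | c); assumes every cluster is non-empty."""
    log_post = np.empty(n_clusters)
    for c in range(n_clusters):
        members = features[assignments == c]
        log_prior = np.log(len(members) / len(features))
        log_post[c] = log_prior + kde_log_density(x, members, bandwidth)
    log_post -= log_post.max()               # stabilize before exponentiating
    post = np.exp(log_post)
    return post / post.sum()
```

Under these assumptions, a perfectly label-aligned two-cluster partition such as `[[0.5, 0], [0, 0.5]]` carries one bit of information about the labels, while a partition independent of the labels carries none; the membership vector returned by `soft_membership` always sums to one.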
Type
conference paper
File(s)
Name
hsu05visual.pdf
Size
119.72 KB
Format
Adobe PDF
Checksum
(MD5):ba94506b143ab29b36440fe0c2fe523f