Generative, discriminative, and ensemble learning on multi-modal perceptual fusion toward news video story segmentation
Journal
2004 IEEE International Conference on Multimedia and Expo (ICME)
Journal Volume
2
Pages
1091
ISBN
0780386035
Date Issued
2004-12-01
Author(s)
Chang, Shih Fu
Abstract
News video story segmentation is a critical task for automatic video indexing and summarization. Our prior work has demonstrated promising performance by using a generative model, called Maximum Entropy (ME), which models the posterior probability given the multi-modal perceptual features near the candidate points. In this paper, we investigate alternative statistical approaches based on discriminative models, i.e., Support Vector Machine (SVM), and Ensemble Learning, i.e., Boosting. In addition, we develop a novel approach, called BoostME, which uses the ME classifiers and the associated confidence scores in each boosting iteration. We evaluated these different methods using the TRECVID 2003 broadcast news data set. We found that SVM-based and ME-based techniques both outperformed the pure Boosting techniques, with the SVM-based solutions achieving slightly higher accuracy. Moreover, we summarize an extensive analysis of error sources across distinctive news story types to identify future research opportunities.
Type
conference paper
