https://scholars.lib.ntu.edu.tw/handle/123456789/413060
Title: | Generative, discriminative, and ensemble learning on multi-modal perceptual fusion toward news video story segmentation | Authors: | HSU WINSTON; Chang, Shih Fu |
Publication date: | 1-Dec-2004 | Volume: | 2 | Start (end) page: | 1091 | Source publication: | 2004 IEEE International Conference on Multimedia and Expo (ICME) | Abstract: | News video story segmentation is a critical task for automatic video indexing and summarization. Our prior work has demonstrated promising performance by using a generative model, called Maximum Entropy (ME), which models the posterior probability given the multi-modal perceptual features near the candidate points. In this paper, we investigate alternative statistical approaches based on discriminative models, i.e. Support Vector Machine (SVM), and Ensemble Learning, i.e. Boosting. In addition, we develop a novel approach, called BoostME, which uses the ME classifiers and the associated confidence scores in each boosting iteration. We evaluated these different methods using the TRECVID 2003 broadcast news data set. We found that SVM-based and ME-based techniques both outperformed the pure Boosting techniques, with the SVM-based solutions achieving even slightly higher accuracy. Moreover, we summarize extensive analysis results of error sources over distinctive news story types to identify future research opportunities. |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/413060 | ISBN: | 0780386035 | DOI: | https://api.elsevier.com/content/abstract/scopus_id/11244288275 |
Appears in collections: | Department of Computer Science and Information Engineering |