Automatic video caption detection and extraction in the DCT compressed domain
Resource
International Conference on Visual Communications and Image Processing, (VCIP 2005)
Journal
Proceedings of SPIE - The International Society for Optical Engineering
Journal Volume
5960
Journal Issue
2
Pages
895-907
Date Issued
2005
Author(s)
Abstract
The text in a video frame can directly help us understand the semantics of the video content. Although many approaches can automatically detect and localize text in a video, most of them operate on the original pixels of an image to find the text regions. In this paper, we present an approach to automatically localize captions in MPEG compressed videos. Caption regions are segmented from the background by using their distinguishing texture characteristics. Unlike previously published methods, which either fully decompress the video sequence before extracting the caption regions or extract text regions only in Intra- (I-) frames, our approach detects and localizes caption regions directly in the DCT compressed domain. Therefore, only a very small amount of decoding is required. Experiments show that a good caption detection rate can be obtained: the average recalls of Intra- and Inter-frame detection are 97.77% and 97.84%, respectively.
Subjects
Caption; Compressed domain
Type
conference paper
