Algorithm and Architecture Design for 3D Video Signal Processing
Date Issued
2010
Date
2010
Author(s)
Cheng, Chao-Chung
Abstract
Digital video technology plays an important role in daily life. As display technologies evolve, display systems can provide higher visual quality. Emerging 3D displays provide a better visual experience than conventional 2D displays, and 3D technology enriches the content of many applications, such as broadcasting, movies, gaming, photography, camcorders, and education. This dissertation discusses video signal conversion for 3D images and video in two parts: depth from stereo vision and single-view 2D-to-3D video conversion. Depth from stereo vision estimates depth from the correspondences between stereo views. 2D-to-3D conversion generates a depth map for 2D video and then uses the depth map to render the 2D video as 3D video.
Stereo matching can be formulated as an energy minimization problem on a 2D Markov random field (MRF). Among the many global optimization methods for MRFs, belief propagation (BP) gives high-quality results and has high potential for real-time processing. However, because of its costly iterative operations and high memory and bandwidth demands, belief propagation as conventionally used for stereo matching is computationally expensive for real-time system implementation. Part I first describes the background of stereo matching using belief propagation. Second, it proposes two algorithms, tile-based belief propagation and a fast message computation algorithm, which reduce the bandwidth, memory, and computational cost of general BP and make real-time processing possible. Third, it presents an efficient VLSI architecture for real-time, high-performance stereo matching. The design combines the fast message computation method with tile-based BP to create a parallel and flexible architecture. The architecture benefits from the proposed hardware design techniques, which reduce bandwidth consumption and improve the efficiency of stereo matching. These techniques include a 3-stage pipeline, fully parallel processing elements for message update, and a boundary message reuse scheme. When operating at 227 MHz, the architecture can generate HDTV 720p disparity maps at 30 fps.
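To make the cost of the message update concrete, the following is a minimal sketch of one generic min-sum BP message update for stereo matching. It is the textbook formulation, not the tile-based or fast message computation algorithm proposed in the dissertation. The function names, the truncated-linear smoothness cost, and its parameters `weight` and `trunc` are assumptions chosen for illustration.

```python
import numpy as np

def truncated_linear(num_labels, weight=1.0, trunc=2.0):
    """Smoothness cost V[lp, lq] = min(weight * |lp - lq|, trunc).

    The truncated-linear form and its parameters are illustrative assumptions.
    """
    d = np.arange(num_labels)
    return np.minimum(weight * np.abs(d[:, None] - d[None, :]), trunc)

def update_message(data_cost, incoming, smooth):
    """Generic min-sum message from pixel p to a neighbour q.

    data_cost: (L,) matching cost of each disparity label at p.
    incoming:  list of (L,) messages into p from its neighbours other than q.
    smooth:    (L, L) smoothness cost indexed as [label_p, label_q].
    """
    # Cost of each label at p, combining the data term and incoming messages.
    h = data_cost + np.sum(incoming, axis=0)
    # For every label at q, take the cheapest compatible label at p.
    msg = np.min(h[:, None] + smooth, axis=0)
    # Subtract the minimum so message values stay bounded across iterations.
    return msg - msg.min()
```

Written this way, each message costs on the order of L² operations for L disparity labels, and it is repeated for every pixel, direction, and iteration. As a rough budget, if HDTV 720p is taken to mean 1280×720 pixels, 30 fps at 227 MHz leaves about 8.2 clock cycles per pixel, which gives a sense of why the update has to be both cheap and parallel.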
In Part II, we generate depth maps from single-view content. Three algorithms are proposed. The first uses three depth cues based on motion parallax, geometrical perspective, and color. This depth-cue-based algorithm is computationally intensive. Therefore, the second algorithm introduces a new concept: it applies a prior hypothesis to assign depth to grouped objects without performing depth cue extraction, which makes it suitable for single 2D images. Finally, the third algorithm uses human depth perception of color and lighting. It has very low computational complexity and introduces few visual side effects. The corresponding real-time demo system is also presented.
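Once a depth map is available, the rendering step can be illustrated with a simple depth-based horizontal shift. The sketch below is a generic depth-image-based rendering example, not the dissertation's renderer. The function name, the `max_shift` parameter, the convention that a depth value of 255 means nearest, and the left-neighbour hole filling are all assumptions.

```python
import numpy as np

def render_shifted_view(image, depth, max_shift=4):
    """Synthesize a second view by shifting pixels right in proportion to depth.

    image: (H, W) grayscale image.
    depth: (H, W) depth map in [0, 255]; 255 is assumed to mean nearest.
    """
    h, w = depth.shape
    out = np.zeros_like(image)
    filled = np.zeros((h, w), dtype=bool)
    # Nearer pixels receive a larger horizontal shift (assumed convention).
    shift = np.rint(depth / 255.0 * max_shift).astype(int)
    for y in range(h):
        # Visit far pixels first so that nearer pixels overwrite them.
        for x in np.argsort(depth[y], kind="stable"):
            tx = x + shift[y, x]
            if 0 <= tx < w:
                out[y, tx] = image[y, x]
                filled[y, tx] = True
        # Fill disocclusion holes by copying the nearest pixel to the left.
        for x in range(1, w):
            if not filled[y, x] and filled[y, x - 1]:
                out[y, x] = out[y, x - 1]
                filled[y, x] = True
    return out
```

Pairing the original frame with the shifted one gives a stereo pair. A practical renderer would typically also smooth the depth map and use a better hole-filling strategy.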
In summary, this dissertation presents an efficient stereo matching hardware architecture that combines tile-based BP with the fast message computation method to generate high-quality depth maps from stereo video. For 2D-to-3D video conversion, three algorithms are proposed, which generate depth from depth cues, a prior hypothesis, and human depth perception, respectively. A 2D-to-3D conversion demo system integrated with a 3D vision kit is also implemented. The proposed 2D-to-3D conversion not only produces high-quality depth maps for 2D video but also achieves real-time processing at HDTV resolution.
Subjects
3D Video, 2D-to-3D Conversion, VLSI Design, Signal Processing
Type
thesis
File(s)
Name
ntu-99-D94943010-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):055b323ec8ba78b6abe02ef12ac10263
