Free-viewpoint 3DTV: Algorithm and Architecture Design
Date Issued
2012
Date
2012
Author(s)
Tsung, Pei-Kuei
Abstract
3DTV is a promising mainstream candidate for next-generation TV systems. High-resolution 3DTV provides users with a vivid viewing experience. Moreover, free-viewpoint view synthesis (FVVS) extends common two-view stereo 3D vision toward virtual reality by generating an unlimited number of views at any desired viewpoint. In this dissertation, the algorithms and VLSI architecture designs of the free-viewpoint 3DTV system are introduced in three parts: visual quality improvement, system analysis and implementation of the 3DTV coding system, and system integration of the 3DTV set-top box SoC. In the visual quality improvement part, several algorithms are proposed to solve perceptual issues in free-viewpoint virtual view synthesis. In the 3DTV coding system part, the system analysis and VLSI architecture design of the MVC encoder are introduced. With the proposed MVC encoder, real-time 4096 × 2160p H.264/AVC encoding and HDTV MVC encoding are achieved. Finally, the free-viewpoint 3DTV system is integrated into the world's first single-chip free-viewpoint 3DTV set-top box SoC, which includes the MVC decoder and the free-viewpoint synthesizer. The 216 fps 4096 × 2160p throughput enables nine view angles to be displayed in real time in parallel.
In the first part of this dissertation, the visual quality improvement algorithms for FVVS are introduced. To provide better free-viewpoint video quality, these algorithms are designed for the inter-view color calibration, virtual view synthesis, and post-processing blocks. To reduce the computational complexity and the complex scheduling of FVVS, a single-iteration view interpolation algorithm is proposed. The single-iteration scheme reduces redundant computation by 86%. Further, artifacts caused by imperfect depth maps are eliminated by the proposed running interpolation and background erosion within the same single iteration. Then, a hybrid color compensation scheme is proposed as the pre-processing stage. Based on inter-view color correspondence estimation, a linear and smooth light field model is established. As a result, color mismatch and ghost effects in the synthesized virtual view frames are eliminated. In addition, regions with strong reflection are detected from the outliers of the linear regression and optimized by a hybrid reflection model, so that proper reflection behavior is shown. Compared with virtual views without color compensation, the proposed method improves PSNR by about 0.26-0.42 dB. After the pre-processing engine and the view synthesis engine, a hybrid inpainting algorithm is presented as the post-processing engine. Motion-oriented, depth-based, and conventional anisotropic filter diffusion methods are combined to achieve better visual quality, so that an appropriate method is available for each type of image artifact. The simulation results show that the proposed hybrid inpainting algorithm achieves better results in both perceptual quality and objective metrics. Finally, a real-time viewpoint-aware 3D video synthesis system is developed at the end of the first part of this dissertation. After GPU-CPU co-optimization, a throughput of 1280 × 720p at 30 fps is achieved on a 4-core notebook.
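The linear color model and the outlier-based reflection detection described above can be illustrated with a minimal sketch. It assumes, hypothetically, that corresponding pixel intensities between two views are already available as pairs. It fits a single gain and offset by ordinary least squares, flags pairs with large residuals as candidate reflection regions, and refits on the remaining pairs. The function names, the threshold, and the sample values are illustrative assumptions and not the dissertation's hybrid scheme.

```python
def fit_gain_offset(src, ref):
    """Ordinary least-squares fit of ref ~= gain * src + offset."""
    n = len(src)
    mean_s = sum(src) / n
    mean_r = sum(ref) / n
    var_s = sum((s - mean_s) ** 2 for s in src)
    cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(src, ref))
    gain = cov / var_s
    offset = mean_r - gain * mean_s
    return gain, offset


def flag_outliers(src, ref, gain, offset, threshold):
    """Mark pairs whose absolute residual exceeds the threshold."""
    return [abs(r - (gain * s + offset)) > threshold for s, r in zip(src, ref)]


# Made-up correspondences: five pairs follow ref = 1.1 * src + 5,
# and the pair (120, 250) stands in for a strongly reflective region.
src = [10, 50, 100, 120, 150, 200]
ref = [16, 60, 115, 250, 170, 225]

# First pass: fit on all pairs and flag large residuals (threshold is assumed).
gain_all, offset_all = fit_gain_offset(src, ref)
outliers = flag_outliers(src, ref, gain_all, offset_all, threshold=40.0)

# Second pass: refit on inliers only, so the outlier no longer skews the model.
inlier_src = [s for s, bad in zip(src, outliers) if not bad]
inlier_ref = [r for r, bad in zip(ref, outliers) if not bad]
gain, offset = fit_gain_offset(inlier_src, inlier_ref)
```

In this sketch the flagged pairs would be handed to a separate reflection model, while the refitted gain and offset would compensate the remaining pixels.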
The algorithm and architecture design for the 3DTV coding system is proposed in the second part of this dissertation. First, a new bandwidth analysis scheme for various MVC structures is proposed. The concept of precedence constraints from graph theory is adopted to derive the processing order in an MVC structure. In addition, two scheduling flows for MVC are proposed for systematic analysis. In combination with the level-C+ data reuse scheme, several design points can be derived. Hardware resource allocation can be systematically defined through the trade-off between system memory bandwidth and on-chip memory. Then, toward the MVC encoder design, several issues in video encoder design for 3DTV applications are discussed. The system analysis shows that the design methods previously used for single-view video coding lead to a dramatic increase in hardware resource requirements and cannot be employed directly. To deal with these design challenges, solutions for each module in the MVC encoder are proposed, including cache-based and predictor-centered IMDE, hybrid open-close loop intra prediction, and FPPDD CABAC. After adopting all the proposed algorithm and architecture optimizations, a single-chip MVC encoder is implemented in a TSMC 90nm process. The proposed MVC encoder supports real-time HD MVC encoding from three views at 1920 × 1080p full HD to seven views at 1280 × 720 HDTV. Furthermore, single-view 4096 × 2160p QFHD H.264/AVC encoding is also supported. With the proposed VLSI techniques, real-time 3D video applications become feasible.
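The idea of deriving a processing order from precedence constraints can be sketched as a topological sort over a frame dependency graph. The three-view prediction structure below is a made-up example, and the frame labels and helper names are hypothetical. The sketch uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later) and is not the dissertation's analysis scheme.

```python
from graphlib import TopologicalSorter

# Hypothetical prediction structure: each frame maps to the set of
# reference frames that must be processed before it.
dependencies = {
    "V0_T0": set(),                 # base view, first frame (intra)
    "V0_T1": {"V0_T0"},             # temporal prediction within view 0
    "V1_T0": {"V0_T0"},             # inter-view prediction from view 0
    "V1_T1": {"V1_T0", "V0_T1"},    # temporal and inter-view references
    "V2_T0": {"V1_T0"},
    "V2_T1": {"V2_T0", "V1_T1"},
}


def processing_order(deps):
    """Return one frame order in which every reference precedes its users."""
    return list(TopologicalSorter(deps).static_order())


def respects_precedence(order, deps):
    """Check that each frame appears after all of its reference frames."""
    position = {frame: i for i, frame in enumerate(order)}
    return all(position[ref] < position[frame]
               for frame, refs in deps.items() for ref in refs)


order = processing_order(dependencies)
```

Several valid orders usually exist for one structure. Choosing among them is where a bandwidth analysis would come in, since different orders keep different reference frames live at the same time.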
The third part of this dissertation introduces the SoC integration of the world's first free-viewpoint 3DTV set-top box SoC. To reduce the hardware complexity of the warping engine, which is the key module of the SoC, a new 3D warping engine model is presented. We show that the rationality and low cost of the linear interpolation approach (LIA) make it suitable for hardware design. In addition, redundant information in the fractional bits of the parameters is further reduced by the precision fitting scheme. As a result, 95.9% and 69.5% of the area are saved for the homographic matrix rendering and vector transform stages, with a negligible PSNR overhead of 0.0059 dB. At the system scheduling level, a hardware-oriented 6D FVVS flow is proposed. Through the proposed texture reorder and the corresponding inverse reorder scheme, the on-chip data scheduling can fit conventional block pipelining even under different viewpoints and geometries. Then, the DWRFS scheduling and the on-chip texture buffer optimization provide frame-level and DRAM word-level bandwidth savings, respectively. After integrating all these technologies, about 95.7% of the system memory bandwidth is saved. Moreover, instead of the conventional line-buffer-based data scheduling and the corresponding horizontal-shift-only geometry, the proposed hardware-oriented 6D FVVS flow supports full 6D free-viewpoint geometries. Finally, after the MVC decoder integration and other VLSI architecture contributions, the proposed free-viewpoint 3DTV set-top box SoC is realized in TSMC 40nm technology. An MVC decoder and a free-viewpoint synthesizer are integrated to support various free-viewpoint 3DTV applications. With the FVVS engine, users can explore 3D scenes in virtual reality from any desired direction and position, rather than only viewing stereo 3DTV from horizontally shifted view angles. The 6D FVVS flow overcomes the view angle limitation of state-of-the-art 3DTV chips and supports full geometries, including 3-dimensional rotation and 3-dimensional translation. The 216 fps 4096 × 2160p throughput of the proposed 3DTV set-top box SoC can be used to display nine different virtual views in real time in parallel. Compared with state-of-the-art 3DTV chips, the throughput is 9× to 40.5× higher. With the proposed DWRFS scheme and texture reorder cache design, 93% of the system memory bandwidth is saved. Furthermore, with the aid of the advanced TSMC 40nm CMOS technology, a power efficiency of 27.5 MPixels/mW is achieved, which is 6.6× to 229× higher than that of state-of-the-art 3DTV chips.
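The throughput figures quoted above can be checked with a short calculation. The sketch assumes that the 216 fps total is divided evenly among the nine views, which the abstract implies but does not state explicitly. The variable names are illustrative.

```python
# Figures taken from the abstract.
WIDTH, HEIGHT = 4096, 2160
TOTAL_FPS = 216
NUM_VIEWS = 9

# Pixels in one 4096 x 2160 frame.
pixels_per_frame = WIDTH * HEIGHT

# Aggregate synthesized pixel rate in pixels per second.
pixel_rate = pixels_per_frame * TOTAL_FPS

# Per-view frame rate, assuming an even split across the views.
fps_per_view = TOTAL_FPS / NUM_VIEWS

print(f"{pixels_per_frame:,} pixels per frame")
print(f"{pixel_rate / 1e9:.2f} Gpixels/s aggregate")
print(f"{fps_per_view:.0f} fps per view")
```

Under that assumption, each of the nine views would be rendered at 24 fps, with an aggregate rate of roughly 1.91 Gpixels/s.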
In brief, the 3D technologies presented in this dissertation lead the way toward a possible next-generation free-viewpoint 3DTV system. The research contributions in this dissertation mark another step toward the human dream of "reality". We sincerely hope that these contributions can open a new era for digital multimedia life.
Subjects
System-on-a-Chip (SoC)
Video
Compression
3DTV
virtual reality
Codec
Parallel architecture
Low Power
Multiview Video Coding
MPEG, AVC
Type
thesis
File(s)
Name
ntu-101-D97943001-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):94c6d6f052b4cdaa2b76f9aae65949c3
