Perception-Aware Video Encoder: Hardware Architecture Design of Bio-Inspired Human Eyes Perception Evaluation Engine for H.264 Video Encoder
Date Issued
2009
Author(s)
Wu, Tung-Hsing
Abstract
Multimedia pervades daily life, but limits on computational complexity and transmission bandwidth make efficient processing of high-quality video essential. The newest video compression standard, H.264/AVC, offers compression ratios ranging from tens to hundreds. The final receiver of video information is the human eye; however, traditional standards use only Peak Signal-to-Noise Ratio (PSNR) as the quality index for a compressed video bit stream. PSNR does not account for the properties of the human visual system (HVS), so the bit allocation of the bit stream is usually not optimized for human perception. Allocating the bit rate effectively across different video content within a limited bandwidth is therefore important: by assigning more bits to perceptually important areas of a frame and fewer to unimportant areas, the bit rate can be reduced. In other words, the compressed video shows better perceptual quality than other compressed video at the same bit rate. The key to such bit allocation is modeling human eye perception in the HVS. However, a system that models these properties and reduces the bit rate while preserving perceptual quality typically demands enormous computational complexity and system bandwidth, and so cannot satisfy the real-time requirements of video encoding systems. We therefore propose a bio-inspired human eye perception evaluation algorithm that improves the bit allocation of video encoders, together with an efficient hardware architecture. The main target of this thesis is modeling the properties of the HVS: the perception evaluation engine must analyze the content of the current video frame and determine the bit allocation for that data.
We adopt and combine the structural similarity (SSIM) model, visual attention models, and visual sensitivity models (including the Just-Noticeable-Distortion (JND) model and the Contrast Sensitivity Function (CSF)), fusing them to obtain a perceptual importance weight for each macroblock (MB) of a video frame. Cooperating with the H.264 video encoding system, we further developed an algorithm and system architecture suitable for hardware implementation to analyze the video content, and proposed a scheme to determine the quantization parameter in the encoding system. To save system bandwidth, we employ macroblock-based processing with the Level-C data reuse scheme as the basic unit of the processing flow, and parallel processing for the hardware of each visual model. The proposed algorithm achieves better bit allocation for video coding systems by changing the quantization parameter at the MB level. Simulations combining our evaluation engine with the H.264 encoder in JM14.0, together with subjective experiments, show that the algorithm achieves about 5–40% bit-rate saving in the QP range of 24–36 without perceptual (visual) quality degradation. The proposed evaluation engine was taped out as a chip in TSMC 0.18 μm technology. The chip size is about 3.3 × 3.3 mm², the power consumption is 83.9 mW, and the processing capability is HDTV 720p (1280×720).
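The abstract describes fusing per-MB visual-model outputs into importance weights and then perturbing the quantization parameter per macroblock. The thesis does not spell out the fusion formula here, so the following is only a minimal, hypothetical sketch: model scores are fused multiplicatively, normalized, and mapped to a bounded QP offset so that more important MBs receive a lower QP (finer quantization). The function names, the multiplicative fusion, and the `max_delta` bound are all assumptions, not the thesis's actual method.

```python
# Hypothetical sketch of MB-level perceptual QP adjustment.
# `attention` and `sensitivity` are assumed per-MB scores in [0, 1]
# produced by attention and JND/CSF-style models (placeholders here).

def fuse_weights(attention, sensitivity):
    # Assumed multiplicative fusion: an MB matters when it is both
    # attended to and visually sensitive to distortion.
    return [a * s for a, s in zip(attention, sensitivity)]

def qp_for_mbs(base_qp, weights, max_delta=3):
    """Map fused weights to per-MB QP values around base_qp."""
    lo, hi = min(weights), max(weights)
    span = (hi - lo) or 1.0
    qps = []
    for w in weights:
        norm = (w - lo) / span                      # 0 = least important MB
        delta = round(max_delta * (1 - 2 * norm))   # important MB -> lower QP
        qps.append(max(0, min(51, base_qp + delta)))  # clamp to H.264 QP range
    return qps
```

With a base QP of 28 and weights `[0.1, 0.5, 0.9]`, this sketch yields QPs `[31, 28, 25]`: the least important MB is quantized more coarsely, freeing bits for the most important one, which matches the bit-allocation idea described above.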
Subjects
Video encoder
H.264
Inter
Intra
Human visual system
Attention model
Perceptual model
SSIM
JND
Motion
Contrast sensitivity function
Contrast
Type
thesis
File(s)
Name
ntu-98-R96943006-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):e738c13cefd3f8a42c8308c373e7287a
