System Design of Perceptual Quality-Regulable H.264 Video Encoder (人眼感知可調節品質之H.264視訊編碼器系統設計)

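Institution: National Taiwan University: Graduate Institute of Electronics Engineering
Contributors: 簡韶逸; 傅昱絜 (Fu, Yu-Jie)
Record dates: 2013-04-10; 2018-07-10
Year: 2011
Record: http://ntur.lib.ntu.edu.tw//handle/246246/256760

Abstract (translated from Chinese):
As video compression standards have evolved from MPEG-1 and MPEG-2 through H.263 to H.264, compression efficiency has improved steadily. The current standard, H.264/AVC, provides compression ratios from tens to hundreds and is a large step up in efficiency over the previous generation. Even so, the final recipients and viewers of the decompressed video are human beings. Video compression standards use only measures such as the sum of absolute differences (SAD) or the sum of squared differences (SSD) as quality metrics for compressed video, yet these metrics correlate poorly with human perception. As a result, bit allocation in video compression is not optimized for human perception. Appropriate bit allocation, for example giving more bit rate to important regions of the frame or to regions with more distortion, can raise overall visual quality. In this thesis, we develop a perceptual quality-regulable H.264 video encoder system. By analyzing the relationship between the reconstructed macroblock (MB) and the best predicted macroblock from mode decision, we propose a predictive quantization parameter (QP) estimation method that regulates video quality according to a predefined perceptual quality. We also propose an automatic quality adjustment mechanism to make better use of the bit budget. In addition, with the help of salient object detection, we can further improve the visual quality of the regions that viewers attend to. The proposed algorithm achieves better bit allocation in the video coding system by changing the quantization parameter of each macroblock. Compared with JM14.0, the reference software for H.264 video coding, we achieve better and more stable video quality. For hardware implementation, we propose a salient object detection engine that serves multiple purposes: besides video compression, it can be applied to object recognition, object segmentation, and similar tasks. Our design uses the TSMC 90 nm process and handles HDTV 1080p (1920×1080) video.

Abstract (English):
With the development of video coding standards from MPEG-1 and MPEG-2 through H.263 to H.264/AVC, coding efficiency has improved step by step. H.264/AVC offers compression ratios of tens to hundreds and is considerably more efficient than its predecessors. However, the final receiver of the video information is a human viewer. Video coding standards use only SAD (sum of absolute differences) or SSD (sum of squared differences) as quality metrics, and these correlate poorly with human perception. Thus the bits of the video bitstream are usually not allocated efficiently with respect to human perception. With proper bit allocation, such as more bits for more important or more distorted regions, the overall quality can be improved. In this work, we develop a perceptual quality-regulable H.264 video encoder system.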
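
The claim that SAD and SSD correlate poorly with perceived quality can be illustrated with a small sketch. The Python below is not taken from the thesis: the 16×16 test block, the two distortions, and the function names are assumptions made for illustration, and `ssim_block` computes SSIM over the whole block as a single window rather than the sliding-window form usually reported.

```python
# Illustrative only: hypothetical data and helper names, not code from the thesis.

def sad(a, b):
    """Sum of absolute differences between two equal-length pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b))


def ssd(a, b):
    """Sum of squared differences between two equal-length pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def ssim_block(a, b, dynamic_range=255, k1=0.01, k2=0.03):
    """Single-window SSIM over one block (no Gaussian weighting)."""
    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / (n - 1)
    var_b = sum((y - mu_b) ** 2 for y in b) / (n - 1)
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / (n - 1)
    c1 = (k1 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    luminance = (2 * mu_a * mu_b + c1) / (mu_a ** 2 + mu_b ** 2 + c1)
    structure = (2 * cov + c2) / (var_a + var_b + c2)
    return luminance * structure


# Hypothetical 16x16 reference macroblock: a horizontal ramp from 100 to 115.
ref = [100 + (i % 16) for i in range(256)]
# Distortion A: uniform brightness shift of +8 on every pixel.
shifted = [p + 8 for p in ref]
# Distortion B: +/-8 checkerboard noise, same per-pixel error magnitude as A.
noisy = [p + (8 if (i // 16 + i % 16) % 2 == 0 else -8)
         for i, p in enumerate(ref)]

if __name__ == "__main__":
    for name, blk in (("shift", shifted), ("noise", noisy)):
        print(name, sad(ref, blk), ssd(ref, blk),
              round(ssim_block(ref, blk), 3))
```

Both distortions give the same SAD and the same SSD, yet the single-window SSIM is roughly 0.997 for the brightness shift and roughly 0.61 for the checkerboard noise, which is the kind of gap between error-sum metrics and structural metrics that the abstract refers to.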
Exploiting the relationship between the reconstructed macroblock and its best predicted macroblock from mode decision, we build a novel predictive quantization parameter estimation method and use it to regulate the video quality according to a predefined perceptual quality. An automatic quality refinement scheme is also developed to make better use of the bit budget. Moreover, with the aid of salient object detection, we further improve the quality of the regions where viewers are likely to focus. The proposed algorithm achieves better bit allocation for the video coding system by changing quantization parameters at the macroblock level. Compared with the JM reference software with macroblock-layer rate control, our algorithm achieves better and more stable quality, as shown by a higher average SSIM index and smaller SSIM variation. For hardware implementation, we propose a salient object detection hardware engine, since salient object detection can be used not only in video coding but also in many other applications such as automatic image cropping, adaptive image display on small devices, object recognition, and tracking. The design is implemented in TSMC 90 nm technology. Its processing capability is HDTV 1080p (1920×1080) at 30 frames per second.

Keywords: perceptual coding; H.264 video encoder; quality regulable
Title: System Design of Perceptual Quality-Regulable H.264 Video Encoder
Type: thesis
Language: en-US
File: application/pdf, 3095280 bytes
Full text: http://ntur.lib.ntu.edu.tw/bitstream/246246/256760/1/ntu-100-R98943015-1.pdf
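
The idea of regulating quality by choosing a quantization parameter per macroblock against a predefined perceptual target can be sketched as a toy search. This is explicitly not the predictive estimation method of the thesis, whose details the abstract does not give. It is a brute-force stand-in under several assumptions: `quantize_block` replaces real H.264 encoding and decoding with uniform quantization of pixel values, `qstep` uses the common approximation that the step size doubles every 6 QP and is about 1.0 at QP 4, and `pick_qp`, `ssim_block`, and the test block are hypothetical names and data.

```python
# Toy illustration only: not the thesis's predictive QP estimation method.

def qstep(qp):
    """Approximate H.264 step size: doubles every 6 QP, about 1.0 at QP 4."""
    return 2.0 ** ((qp - 4) / 6.0)


def quantize_block(block, qp):
    """Assumed stand-in for encode + decode: uniform pixel quantization."""
    step = qstep(qp)
    return [round(p / step) * step for p in block]


def ssim_block(a, b, dynamic_range=255, k1=0.01, k2=0.03):
    """Single-window SSIM over one block (no Gaussian weighting)."""
    n = len(a)
    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((x - mu_a) ** 2 for x in a) / (n - 1)
    var_b = sum((y - mu_b) ** 2 for y in b) / (n - 1)
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / (n - 1)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    luminance = (2 * mu_a * mu_b + c1) / (mu_a ** 2 + mu_b ** 2 + c1)
    structure = (2 * cov + c2) / (var_a + var_b + c2)
    return luminance * structure


def pick_qp(block, target_ssim, qp_min=0, qp_max=51):
    """Largest QP whose toy reconstruction still meets target_ssim.

    Scans every QP because SSIM need not fall monotonically with QP here.
    Falls back to qp_min if no QP in the range meets the target.
    """
    best = qp_min
    for qp in range(qp_min, qp_max + 1):
        if ssim_block(block, quantize_block(block, qp)) >= target_ssim:
            best = qp
    return best


# Hypothetical 16x16 macroblock: a horizontal ramp from 100 to 115.
block = [100 + (i % 16) for i in range(256)]

if __name__ == "__main__":
    for target in (0.98, 0.90):
        print("target SSIM", target, "-> QP", pick_qp(block, target))
```

A stricter SSIM target can never yield a larger QP than a looser one in this sketch, since the set of acceptable QPs only shrinks. A real encoder cannot afford to encode each macroblock at all 52 QPs, which is presumably why the thesis estimates the QP predictively instead.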