Title: VLSI Architecture Design of Fractional Motion Estimation for H.264/AVC
Authors: Chen, Yi-Hau; Chen, Tung-Chien; Chien, Shao-Yi; Huang, Yu-Wen; Chen, Liang-Gee
Type: journal article
Year: 2008
Record dates: 2009-02-25; 2018-07-06
ISSN: 19398018
DOI: 10.1007/s11265-008-0213-7
Scopus ID: 2-s2.0-53649084486
Scopus: https://www.scopus.com/inward/record.uri?eid=2-s2.0-53649084486&doi=10.1007%2fs11265-008-0213-7&partnerID=40&md5=73ea9b9b802ba45b795cbf040843e579
Handle: http://scholars.lib.ntu.edu.tw/handle/123456789/340993
Full text: http://ntur.lib.ntu.edu.tw/bitstream/246246/141480/1/68.pdf
Format: application/pdf, 987949 bytes

Abstract: The H.264/AVC Fractional Motion Estimation (FME) with rate-distortion constrained mode decision can improve the rate-distortion efficiency by 2-6 dB in peak signal-to-noise ratio. However, it comes with considerable computational complexity, so acceleration by dedicated hardware is a must for real-time applications. The main difficulty for FME hardware implementation is parallel processing under the constraint of the sequential flow and data dependency. We analyze seven inter-correlative loops extracted from the FME procedure and provide decomposition methodologies to obtain an efficient projection onto hardware. Two techniques, 4×4 block decomposition and efficient vertical scheduling, are proposed to reuse data among the variable block sizes and to improve hardware utilization. In addition, advanced architectures are designed to efficiently integrate the 6-tap 2D finite impulse response filter, residue generation, and 4×4 Hadamard transform into a fully pipelined architecture. This design is implemented and integrated into an H.264/AVC single-chip encoder that supports real-time encoding of 720×480 30 fps video with four reference frames at an 81 MHz operating frequency with 405 K logic gates (41.9% of the encoder area).
© 2008 Springer Science+Business Media, LLC.

Author keywords: H.264/AVC; Motion estimation; Video coding; VLSI architecture
Index keywords: Block decomposition; Computation complexity; Data dependencies; Dedicated hardware; Finite-impulse response; Fractional motion estimation; Fully pipelined; H.264/AVC; Hadamard transform; Hardware implementations; Hardware utilization; Mode-decision; Operation frequencies; Parallel processing; Peak signal-to-noise ratio; Rate distortions; Real-time applications; Real-time encoding; Reference frames; Single-chip encoder; Variable block-size; Video coding; VLSI architecture; VLSI architecture design; Decomposition; Electric distortion; Hardware; Impulse response; Motion estimation; Motion Picture Experts Group standards; Signal distortion; Signal to noise ratio; Architectural design
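
Illustrative note: the abstract names three per-block operations that the architecture pipelines, namely 6-tap interpolation, residue generation, and a 4×4 Hadamard transform. The Python sketch below shows what those operations compute in software for one 4×4 block. It is not the paper's hardware design. The function names are hypothetical, only horizontal half-pel filtering is shown, and the un-normalized cost is an assumption (encoders often scale it). The tap values (1, -5, 20, 20, -5, 1) with rounding and a shift by 5 are the standard H.264/AVC luma half-sample filter.

```python
# Software sketch of the operations named in the abstract; names are hypothetical.

TAPS = (1, -5, 20, 20, -5, 1)  # H.264/AVC luma half-sample filter taps (sum = 32)

# 4x4 Hadamard matrix; it is symmetric, so its transpose equals itself.
H4 = ((1, 1, 1, 1),
      (1, 1, -1, -1),
      (1, -1, -1, 1),
      (1, -1, 1, -1))


def clip8(value):
    """Clamp a value to the 8-bit pixel range."""
    return max(0, min(255, value))


def half_pel_row(row):
    """Horizontal half-pel samples of one row of integer pixels.

    Output sample i lies halfway between row[i + 2] and row[i + 3],
    so a row of n pixels yields n - 5 half-pel samples.
    """
    out = []
    for i in range(len(row) - 5):
        acc = sum(tap * row[i + k] for k, tap in enumerate(TAPS))
        out.append(clip8((acc + 16) >> 5))  # round, then divide by 32
    return out


def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested sequences."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def satd4x4(cur, ref):
    """Sum of absolute Hadamard-transformed differences of two 4x4 blocks.

    Residue generation is the element-wise difference; the cost is left
    un-normalized here (an assumption of this sketch).
    """
    residue = [[cur[i][j] - ref[i][j] for j in range(4)] for i in range(4)]
    transformed = matmul4(matmul4(H4, residue), H4)
    return sum(abs(v) for row in transformed for v in row)
```

In a fractional search, a cost like `satd4x4` would be evaluated for each candidate reference block built from interpolated samples, and the candidate with the lowest cost kept. Every larger block size can be covered by 4×4 blocks, which is what makes the 4×4 block decomposition mentioned in the abstract possible.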