Low-Power Architecture Design for MPEG-4 SP Encoder and H.264 Motion Estimation
Date Issued
2006
Author(s)
Lin, Chia-Ping
Abstract
Low power consumption is an important requirement for battery-limited systems such as
mobile devices. Many applications on mobile devices are computationally intensive
and consume considerable power. Video encoding/decoding is one such application,
and it needs dedicated low-power design of both algorithms and architectures to reduce
power consumption. A video encoding system consists of several coding components,
including motion estimation (ME), discrete cosine transform (DCT), inverse discrete
cosine transform (IDCT), entropy coding, and others depending on the standard.
These components have different computational characteristics, so different algorithms and
architectures are needed to meet the low-power requirement while maintaining
performance.
MPEG-4 is a video compression standard established in 1999. It has been widely
adopted for video compression ever since. On mobile devices, the MPEG-4 simple profile (SP) is
popular because of its simplicity and good coding performance. It contains
basic but useful encoding components such as ME, DCT, IDCT, AC/DC prediction, and
variable length coding.
We analyze key components of the MPEG-4 SP encoder, such as ME, DCT, and
IDCT, and develop suitable low-power algorithms and architectures for them. After
optimizing each module, we integrate them into a proposed low-power MPEG-4 SP encoder.
The power consumption of ME is reduced by a fast algorithm and a two-dimensional
bandwidth-sharing architecture. The power consumption of DCT and IDCT is reduced by
content awareness. These algorithms achieve substantial power reduction while maintaining
tolerable coding performance. At the circuit level, a fine-grained, leaf-based gated-clock
technique is applied to most registers in this design.
A 2-D data-sharing architecture is proposed for the ME design. To reduce computational
complexity, a moving-window search with a modified predictor scheme is adopted. It
reduces computation while degrading coding quality by less than 0.05 dB compared with full
search. The final bandwidth requirement is reduced to 0.65% of that of
full search without data sharing. Adaptive DCT is proposed for content-aware computation.
It combines several low-power techniques and solves the precision problem through
coefficient scaling, a hybrid architecture, and a proposed content classification algorithm.
The high probability of zero occurrence is exploited in the IDCT and in the data transfer between
quantization (Q) and variable length coding (VLC). Our IDCT adopts the earlier design
proposed by Xanthopoulos [1] with coefficient scaling. It achieves low power consumption
for zero-valued computation. A zero marker scheme is proposed to avoid transferring zero-valued
data. The occurrence of zero-valued data is recorded in registers, so
memory read/write operations on zero-valued data can be avoided. This reduces
memory access between Q and VLC by 60% to 80%. Finally, the encoder chip is fabricated
in a TSMC 0.18 µm 1P6M CMOS process. It contains 201K logic gates
and 4.56 KB of SRAM. It supports CIF 30 fps encoding with acceptable performance and
supports VGA 30 fps as an extended resolution. The post-layout gate-level power consumption
estimated by Synopsys PrimePower is 5.9 mW for I-VOP encoding and
9.7 mW for P-VOP encoding at 1.8 V in CIF 30 fps encoding. The real power estimate
for this chip is 2.5 mW for I-VOP encoding and 5 mW for P-VOP encoding at 1.3 V in
CIF 30 fps encoding. This is a substantial power reduction over previous works.
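
The zero marker scheme described above can be illustrated in software. The following is a minimal Python sketch under stated assumptions, not the hardware design of the thesis: it assumes a block of 64 quantized coefficients, uses a 64-bit integer as a stand-in for the marker registers, and the names pack_block and unpack_block are hypothetical.

```python
def pack_block(coeffs):
    """Split quantized coefficients into a zero-marker bitmap and the non-zero values.

    Assumption: one marker bit per coefficient, held in a register-like integer.
    """
    marker = 0
    nonzero = []
    for i, c in enumerate(coeffs):
        if c != 0:
            marker |= 1 << i       # bit i set means coefficient i is non-zero
            nonzero.append(c)      # only non-zero values would be written to memory
    return marker, nonzero


def unpack_block(marker, nonzero, size=64):
    """Rebuild the full coefficient list from the bitmap and the stored values."""
    values = iter(nonzero)
    return [next(values) if (marker >> i) & 1 else 0 for i in range(size)]
```

In this sketch the consumer (standing in for VLC) reads the marker first and fetches only the values whose bits are set, so zero-valued coefficients never cause a memory access. The actual saving depends on how sparse the quantized blocks are.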
H.264 is the newest video compression standard developed by the Joint Video Team
(JVT). It can reduce bit-rate by 39%, 49%, and 64% compared with MPEG-4, H.263,
and MPEG-2, respectively. Its excellent coding performance has led to wide adoption in commercial
applications including digital TV broadcasting, next-generation DVD, and network
streaming.
This coding performance makes H.264 suitable for high-resolution video
compression, but it also brings huge computation overhead and consumes substantial hardware
resources and power. To address this problem, we focus on the integer motion estimation
(IME) part of the H.264 encoder, which accounts for most of the computation, especially at high
resolutions such as high-definition TV (HDTV). We propose a hierarchical ME algorithm
that reduces computational complexity to 0.45% of that of full search and improves
coding performance. The corresponding architecture can process block matching at
three different levels and supports a good data-sharing scheme at each of them. This
makes it suitable for low-power H.264 encoder design for high-resolution applications.
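
The idea of block matching at three levels can be sketched generically. The following Python sketch is an illustration under stated assumptions, not the algorithm of the thesis: it assumes a three-level pyramid built by 2x2 averaging, a sum of absolute differences (SAD) cost, a coarse search radius of 4 and a refinement radius of 1, and all function names are hypothetical.

```python
def downsample(img):
    """Halve the resolution by integer-averaging each 2x2 neighbourhood."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) // 4
             for x in range(w)] for y in range(h)]


def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of cur at (bx, by) and ref displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def search(cur, ref, bx, by, n, centre, radius):
    """Return the best (dx, dy) within +/-radius of centre, skipping out-of-frame candidates."""
    h, w = len(ref), len(ref[0])
    best, best_cost = centre, None
    for dy in range(centre[1] - radius, centre[1] + radius + 1):
        for dx in range(centre[0] - radius, centre[0] + radius + 1):
            if not (0 <= bx + dx and bx + dx + n <= w
                    and 0 <= by + dy and by + dy + n <= h):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best


def hierarchical_me(cur, ref, bx, by, n=16, coarse_radius=4, refine_radius=1):
    """Coarse-to-fine motion search: level 2 (1/4 size), level 1 (1/2 size), level 0 (full)."""
    cur_pyr, ref_pyr = [cur], [ref]
    for _ in range(2):
        cur_pyr.append(downsample(cur_pyr[-1]))
        ref_pyr.append(downsample(ref_pyr[-1]))
    # Wide search on the smallest frames, where each candidate is cheap.
    mv = search(cur_pyr[2], ref_pyr[2], bx // 4, by // 4, n // 4, (0, 0), coarse_radius)
    # Scale the vector up and refine it with a small window at each finer level.
    for level in (1, 0):
        scale = 1 << level
        centre = (mv[0] * 2, mv[1] * 2)
        mv = search(cur_pyr[level], ref_pyr[level], bx // scale, by // scale,
                    n // scale, centre, refine_radius)
    return mv
```

With these assumed radii, the sketch evaluates at most 81 coarse candidates on 4x4 blocks plus 9 candidates at each finer level, whereas a full search over the equivalent range of +/-16 pixels evaluates 1089 candidates on 16x16 blocks. The 0.45% figure quoted above refers to the thesis algorithm, not to this sketch.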
Subjects
MPEG-4編碼器
H.264移動估計
低功率
MPEG-4 encoder
H.264 IME
low power
Type
thesis
File(s)
Name
ntu-95-R93943018-1.pdf
Size
23.31 KB
Format
Adobe PDF
Checksum
(MD5):c1633510f7543fca5c26818d5dba4adb