Hardware Architecture Design and Implementation of Universal Vertex/Pixel Shader for 3D Graphics System
Date Issued
2007
Date
2007
Author(s)
Lin, Yu-Cheng
DOI
en-US
Abstract
3D graphics technology, which has been under development since the 1960s, is widely used in animation, games, and user interfaces. For real-time graphics applications, Graphics Processing Units (GPUs) are currently designed mainly for desktop environments.
In recent years, graphics accelerators have undergone two important shifts. First, the fixed-function pipeline of the early days is gradually being replaced by the programmable pipeline, also called the shader pipeline. The shader pipeline gives artists and programmers the freedom to program the GPU, and striking graphics effects are emerging in a steady stream. Second, graphics accelerators for mobile devices are becoming increasingly important. Powerful graphics functions are being integrated into handheld devices to provide users with better user interfaces and portable gaming environments.
The limited resources of mobile devices, in both hardware and energy, are the major obstacle to providing 3D graphics capability on handheld devices. Several low-power, low-cost solutions have been proposed in recent years, but they deliver low performance. A more efficient solution, in which computing, memory, and power resources are allocated effectively, is still required.
In this thesis, low-power, cost-efficient, yet high-performance universal vertex/pixel shaders are proposed to replace the vertex shader and the pixel shader of the traditional programmable pipeline. The thesis makes three major contributions in hardware architecture. First, the universal vertex/pixel shader, which unifies the functions of the vertex shader and the pixel shader and can adaptively allocate resources at execution time according to the scenario, is proposed to solve the load-imbalance problem. Second, a configurable memory array (CMA) can be used as an input/output vertex cache and can change its configuration dynamically to keep memory usage efficient across different applications. Finally, several low-power design techniques are proposed, chiefly early rejection after transformation (ERAT) and clock gating. The ERAT technique analyzes the contents of transformed primitives to avoid redundant lighting computation, which reduces the power consumption of the shaders. Instruction-level clock gating is derived from the operation (OP) code and the active vector code: the clocks of the data registers of un-issued PEs are gated to save dynamic power, and unused vector pipelines are turned off and gated.
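The abstract does not state which tests ERAT applies to transformed primitives, so the following is only a minimal software sketch of the general idea, assuming a standard clip-space frustum test: a triangle whose three vertices all lie outside the same clip plane cannot be visible, so its lighting can be skipped. The names `outcode`, `trivially_rejected`, and `shade_visible` are hypothetical and are not taken from the thesis.

```python
from typing import Callable, List, Sequence, Tuple

Vec4 = Tuple[float, float, float, float]   # clip-space vertex (x, y, z, w)
Triangle = Sequence[Vec4]                  # three transformed vertices


def outcode(v: Vec4) -> int:
    """Return a bitmask of the clip planes this vertex lies outside of."""
    x, y, z, w = v
    code = 0
    if x < -w:
        code |= 1    # left
    if x > w:
        code |= 2    # right
    if y < -w:
        code |= 4    # bottom
    if y > w:
        code |= 8    # top
    if z < -w:
        code |= 16   # near
    if z > w:
        code |= 32   # far
    return code


def trivially_rejected(tri: Triangle) -> bool:
    """True when all three vertices are outside one common clip plane."""
    a, b, c = (outcode(v) for v in tri)
    return (a & b & c) != 0


def shade_visible(triangles: List[Triangle],
                  light: Callable[[Triangle], object]):
    """Run the (assumed expensive) lighting step only on surviving triangles.

    Returns the lit results and the number of triangles whose lighting
    was skipped by the early-rejection test.
    """
    lit, skipped = [], 0
    for tri in triangles:
        if trivially_rejected(tri):
            skipped += 1          # rejected after transformation, never lit
            continue
        lit.append(light(tri))
    return lit, skipped
```

In this sketch the saving comes purely from not calling `light` on rejected triangles; in hardware the corresponding saving would be the shader cycles, and therefore the power, that the lighting instructions would otherwise consume.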
The proposed design techniques are verified by a real implementation. Implementation results show that over 40 percent of the processing time can be saved with all of the architectural advantages mentioned above. The prototype chip is fabricated in UMC 90 nm technology. The die size is 3.500 × 3.500 mm². It is capable of processing 200 mega vertices per second and 200 mega pixels per second, which is equivalent to 6.4 giga floating-point operations per second. The power consumption is 10.75 mW in the worst case when the chip works at 200 MHz.
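The reported figures can be related to one another with simple arithmetic. The short sketch below only divides the numbers quoted in the abstract; the derived per-cycle quantities are not stated in the source, and reading them as "one vertex per cycle" or "32 floating-point operations per cycle" is an interpretation rather than a claim from the thesis.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 200e6          # 200 MHz operating frequency
VERTICES_PER_S = 200e6    # 200 mega vertices per second
PIXELS_PER_S = 200e6      # 200 mega pixels per second
FLOPS = 6.4e9             # 6.4 giga floating-point operations per second
POWER_W = 10.75e-3        # 10.75 mW worst-case power
DIE_SIDE_MM = 3.5         # 3.500 mm die edge

# Derived quantities (plain division; not stated in the source).
vertices_per_cycle = VERTICES_PER_S / CLOCK_HZ
pixels_per_cycle = PIXELS_PER_S / CLOCK_HZ
flops_per_cycle = FLOPS / CLOCK_HZ
energy_per_cycle_pj = POWER_W / CLOCK_HZ * 1e12
die_area_mm2 = DIE_SIDE_MM * DIE_SIDE_MM

print(f"vertices per cycle: {vertices_per_cycle:.2f}")
print(f"pixels per cycle:   {pixels_per_cycle:.2f}")
print(f"FLOPs per cycle:    {flops_per_cycle:.1f}")
print(f"energy per cycle:   {energy_per_cycle_pj:.2f} pJ")
print(f"die area:           {die_area_mm2:.2f} mm^2")
```

Under these numbers the peak rates correspond to one vertex or one pixel per clock cycle, and the worst-case power corresponds to roughly 54 pJ per cycle.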
Subjects
universal vertex and pixel shader processor
computer graphics
programmable processor
universal vertex/pixel shader
graphics
programmable shader
Type
thesis
File(s)
Name
ntu-96-R94943020-1.pdf
Size
23.31 KB
Format
Adobe PDF
Checksum
(MD5):feef681cac0ac71892e00efe1ae64b5e
