Hardware Architecture Design and Implementation of Ray-Triangle Intersection with Bounding Volume Hierarchies
Date Issued
2007
Author(s)
Lee, Chuan-Yiu
DOI
en-US
Abstract
Ray tracing is a simple yet powerful and general algorithm for accurately computing
global light transport and rendering high-quality images. While recent algorithmic improvements
and optimized parallel software implementations have raised ray tracing
performance to interactive levels, few efficient hardware solutions have been available because
traditional ray tracing algorithms are poorly suited to hardware. This thesis proposes a more
hardware-friendly ray tracing algorithm and describes an architecture based on this algorithm.
We also implement the world's first prototype chip for ray tracing
using a standard-cell-based design flow. With the proposed algorithm, the on-chip SRAM usage of
our design is reduced dramatically compared with previous architectures, while the amount of
computation remains similar. We also use multi-threading and folding techniques to
increase hardware utilization and achieve maximum performance with minimum hardware
resources. The external bandwidth, kept low by a tiny cache combined with word-length analysis
and a vertex-sharing technique, is low enough that many identical units can be replicated
to process in parallel. The prototype chip is fabricated in TSMC 0.13 μm technology.
The chip size is 1.697 × 1.7 mm². It is capable of 4.3 giga floating-point operations
per second (GFLOPS).
Subjects
光線追跡
硬體架構
三維繪圖
Ray Tracing
Hardware Architecture
3D Graphics
Type
thesis
File(s)
Name
ntu-96-R94943110-1.pdf
Size
23.31 KB
Format
Adobe PDF
Checksum
(MD5):873a7ba0d650dcc7464b6d69085efbdc