Ray-Pool Based Configurable Multiprocessor Design for Ray Tracing
Date Issued
2015
Date
2015
Author(s)
Hsieh, Ming-Lun
Abstract
Realistic image synthesis is now everywhere in our daily lives. People seek displays that look far more realistic than traditional rasterization-based methods can deliver. Ray tracing, one of the physically based rendering methods, is a candidate for the next generation of image synthesis. However, it brings problems of large-scale computation and ray diversity that must be overcome. In this thesis, a simple but complete hardware design for real-time ray tracing is proposed to address these problems.
Ray tracing requires additional computation for ray generation, traversal, and intersection tests with objects. The number of rays may grow exponentially with each bounce of reflection and refraction, or with the samples drawn from a specific distribution in path tracing. Meanwhile, there are hundreds of thousands of triangles to test, which has motivated research on acceleration data structures. On the other hand, rays become more and more diverse as they traverse deeper, which is not compatible with the programming model of current GPGPUs. The divergent data access in ray tracing degrades performance through cache evictions.
The ray-pool based ray tracer (RPRT for short) is then proposed as a hardware architecture to solve these problems. Like previous work such as SGRT and PowerVR GR6500, RPRT moves traversal and intersection to specialized hardware while keeping programmability in the shader. RPRT makes these two computational units cooperate efficiently by adding two pools between them. Through the arrangement and virtualization of the two pools, the two computational units can work simultaneously and independently. In this system, a configurable multiprocessor is designed to serve as the shader. It is easy to configure and easy to extend for different applications with customized ALUs. It comes with a near-cycle-accurate simulator verified against the hardware implementation after logic synthesis.
The hardware runs at a clock rate of 1 GHz under the TSMC 40 nm process. In experiments, the RPRT system reaches 100 Mrays/s with a multiprocessor of 4 compute units (CUs), 8 warps and 8 cores in each CU, 108 KB pools, and an ASIC traverser capable of about 100 Mtraversals/s. This result shows that real-time hardware ray tracing is feasible at 720p with basic effects using primary rays, and at 480p with a traversal depth of 5.
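The resolution claim can be sanity-checked with a rough ray budget. The sketch below is only a back-of-envelope estimate and is not taken from the thesis: the 30 fps target, the pixel dimensions chosen for 720p and 480p, and the function names are all assumptions made for illustration. Only the 100 Mrays/s figure comes from the abstract.

```python
# Back-of-envelope ray budget. The 30 fps target and the exact pixel
# dimensions are assumptions, not figures from the thesis.
RAYS_PER_SECOND = 100_000_000  # 100 Mrays/s, as reported in the abstract

RESOLUTIONS = {
    "720p": (1280, 720),
    "480p": (640, 480),  # assumed 4:3 VGA; 854x480 is also called 480p
}


def rays_per_pixel(width, height, fps, rays_per_second=RAYS_PER_SECOND):
    """Average number of rays available per pixel per frame."""
    return rays_per_second / (width * height * fps)


def budget_table(fps=30):
    """Ray budget per pixel per frame for each assumed resolution."""
    return {
        name: rays_per_pixel(w, h, fps)
        for name, (w, h) in RESOLUTIONS.items()
    }


if __name__ == "__main__":
    for name, budget in budget_table().items():
        print(f"{name}: {budget:.2f} rays per pixel per frame at 30 fps")
```

Under these assumptions the budget is roughly 3.6 rays per pixel per frame at 720p and roughly 10.9 at 480p, which is at least in line with the abstract's pairing of primary rays with 720p and deeper recursion with 480p.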
Subjects
hardware architecture
GPU
3D computer graphics
ray tracing
Whitted style ray tracing
multiprocessor
Type
thesis
File(s)
Name
ntu-104-R02943024-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):f7ba23797cb304566422b306bd8f2622
