簡韶逸臺灣大學:電子工程學研究所李權祐Lee, Chuan-YiuChuan-YiuLee2007-11-272018-07-102007-11-272018-07-102007http://ntur.lib.ntu.edu.tw//handle/246246/57268光線追跡是一種簡單卻又極具威力的三維繪圖技術。它利用模擬真實光線行進的方式來描繪出極具真實的畫面,但隨之而來的卻是計算上的高複雜度,也因此無法達到即時的需求。近幾年來,隨著硬體運算能力及演算法的進步,許多軟體實現的光線追跡器以大幅提升其效能到接近即時的程度。但是仍然只有極少數硬體實現的光線追跡器被提出。其原因主要是在於傳統的光線追跡管線的演算法並不利於硬體上的設計。本論文提出一個新的光線追跡演算法。它的特點是相對傳統的演算法而言能夠大量減少晶片內記憶體的使用,卻又能保持相似的效能。 同時,根據此演算法,我們也提出相對應的硬體架構。在此架構裡,我們還使用了多種硬體架構設計的方法來提升其硬體使用率,包括多執行緒、摺疊等,使其能在有限的硬體資源下達到最高的效能。在外部記憶體頻寬方面,運用位元長度分析和頂點分享的技術來減少頻寬。 我們也按照標準單元設計流程實作原形晶片。該晶片利用台積電130 nm技術製程,面積為1。697 X 1。7mm^2。其處理能力為每秒43億浮點數指令運算。Ray tracing is a simple yet powerful and general algorithm for accurately computing global light transport and rendering high quality images. While recent algorithmic improvements and optimized parallel software implementations have increased ray tracing performance to interactive levels, few efficient hardware solution has been available due to hardware unfriendly of traditional ray tracing algorithm. This thesis proposes a more hardware friendly ray tracing algorithm and describes the architecture based on this algorithm. We also implement a first prototype chip around the world for ray tracing with standard cell based design flow. By the proposed algorithm, on-chip sram usage of my design is reduced dramatically compared to previous architectures while it retains a similar computation amounts. We also use multi-threading and folding technique to increase the hardware utilization and achieve maximum performance at minimum hardware resource. The external bandwidth is low enough to duplicate many the same units to process in parallel, which is achieved by a tiny cache with word length analysis and vertex sharing technique. The prototype chip is fabricated by TSMC 0.13 μm technology. The chip size is 1.697×1.7mm2. It is capable of 4.3 giga floating point operations per-second. viiAbstract vii 1 Introduction 1 1.1 Why is Ray Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 What is Ray Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2.1 Ray Generation . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.2.2 Traversal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 1.2.3 Intersection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.2.4 Shading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.3 Thesis Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2 PreviousWorks 7 2.1 Algorithm Progress in Interactive Ray Tracing . . . . . . . . . . . . . 7 2.2 Parallel Platform for Ray Tracing . . . . . . . . . . . . . . . . . . . . . 8 3 Algorithm Development 9 3.1 Traversal Algorithm Overview . . . . . . . . . . . . . . . . . . . . . . 9 3.1.1 Uniform Grids . . . . . . . . . . . . . . . . . . . . . . . . . . 9 3.1.2 KD-tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 3.1.3 BVH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 3.2 Bounding Volume Hierarchy Algorithm In Ray Tracing . . . . . . . . . 12 3.2.1 Construction . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 3.2.2 AABB Intersection Test . . . . . . . . . . . . . . . . . . . . . 14 3.3 New Traversal Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 16 3.3.1 First Hit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 i ii 3.3.2 Frustum Culling . . . . . . . . . . . . . . . . . . . . . . . . . 17 3.3.3 Change The Ray Order When Both Tests Are Fail . . . . . . . . 19 3.3.4 Node Stack Elimination . . . . . . . . . . . . . . . . . . . . . 19 3.3.5 Record First Active Ray . . . . . . . . . . . . . . . . . . . . . 20 3.3.6 Traversal at Leaf . . . . . . . . . . . . . . . . . . . . . . . . . 21 3.4 Experimental Results and Comparison . . . . . . . . . . . . . . . . . . 22 3.5 Ray-Triangle Intersection . . . . . . . . . . . . . . . . . . . . . . . . 26 3.5.1 Algorithm Adoption . . . . . . . . . . . . . . . . . . . . . . . 26 3.5.2 Fast, Minimum Storage Ray Triangle Intersection . . . . . . . . 27 4 Architecture Design of Ray Tracer 29 4.1 Architecture Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 29 4.2 The Ray Casting Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 4.3 Traversal Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 4.4 Traversal Controller . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 4.5 Intersection Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 4.6 Ray Buffer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43 4.7 Arbiter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 4.8 Result Buffer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 4.9 Cache Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 5 Chip Implementation 53 5.1 Design Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 5.2 Test Consideration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 5.2.1 Ad-hoc Testing . . . . . . . . . . . . . . . . . . . . . . . . . . 56 5.2.2 SRAM BIST . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 5.3 Chip Layout and Specification . . . . . . . . . . . . . . . . . . . . . . 57 5.4 Experimental Results and Comparison . . . . . . . . . . . . . . . . . . 57 6 Conclusion 63989988 bytesapplication/pdfen-US光線追跡硬體架構三維繪圖Ray TracingHardware Architecture3D Graphics使用層次包圍盒之光線三角形相交測試之硬體架構設計與實現Hardware Architecture Design and Implementation of Ray-Triangle Intersection with Bounding Volume Hierarchiesthesishttp://ntur.lib.ntu.edu.tw/bitstream/246246/57268/1/ntu-96-R94943110-1.pdf