Advisor: 李建模
National Taiwan University: Graduate Institute of Electronics Engineering
Author: 廖官榆 (Liao, Kuan-Yu)
Record dates: 2014-11-30, 2018-07-10
Year: 2014
URL: http://ntur.lib.ntu.edu.tw//handle/246246/263965

Abstract (translated from the Chinese): As process technology advances, chip testing faces two important issues: complex defect behavior and excessively long automatic test pattern generator (ATPG) runtime. To address both, we propose a high-quality ATPG algorithm that uses a graphics processing unit (GPU). Traditional central processing unit (CPU) based ATPG usually relies on very fast decision making and generates one test pattern for one fault at a time. In contrast, the proposed ATPG can generate test patterns for multiple faults and multiple test objectives at the same time. The proposed method implements three levels of parallelism: device-level fault partitioning, block-level circuit partitioning, and word-level search partitioning. The result is a highly parallel algorithm that can generate thousands of test patterns simultaneously. The core of the algorithm is the Split-into-W-Clones ATPG, which converts the decisions required during test generation into parallel word-level logic operations so that many decisions can be made at the same time. We further use the GPU to accelerate the algorithm. We also propose three extensions that address timing-aware and cell-aware issues. Experimental results show that the proposed algorithm is, in most cases, better than current industrial CPU-based ATPG in quality, runtime, and test length.

Abstract: Due to the scaling of manufacturing technology, two issues are critical in testing modern chips: 1) complex defect behavior; and 2) long automatic test pattern generator (ATPG) runtime. To deal with these issues, a graphics processing unit (GPU) based ATPG framework is proposed. Unlike central processing unit (CPU) based ATPG, which relies on fast serial decision making and generates one test pattern at a time, the proposed framework is capable of targeting multiple test objectives and multiple faults at the same time. This framework provides a completely new approach to the current test issues. The framework implements three levels of parallelism: device-level fault partitioning, block-level circuit partitioning, and word-level search space partitioning. The result is a massively parallel algorithm that can generate thousands of patterns simultaneously. Such parallelism has not been achieved on traditional CPU-based ATPG. The core of the framework is the Split-into-W-Clones (SWK) parallel ATPG algorithm, which can generate test patterns that meet multiple objectives.
SWK uses random split to convert decisions into parallel bitwise logic operations so that multiple objectives can be tried at the same time. A GPU-based massively parallel technique is then proposed to accelerate the SWK algorithm. Three extensions based on the framework are also proposed to deal with timing-aware and cell-aware issues. Results show that the framework provides higher quality, shorter test length, and shorter runtime than a state-of-the-art CPU-based commercial ATPG.

Contents
Oral Defense Committee Certification (Chinese)
Oral Defense Committee Certification (English)
Chinese Abstract
Abstract
1 Introduction
  1.1 Motivation
    1.1.1 Test solutions for complex defect behaviors
    1.1.2 ATPG runtime for targeting complex test metrics
  1.2 Goal
  1.3 Contribution
  1.4 Organization
2 Background
  2.1 Test Quality
  2.2 ATPG
    2.2.1 Timing-Aware ATPG
    2.2.2 Timing-Unaware ATPG
    2.2.3 Parallel ATPG
  2.3 GPU Computation
3 SWK ATPG Algorithm
  3.1 SWK Concept
    3.1.1 SWK signal encoding
    3.1.2 Difference between SWK and traditional ATPG
  3.2 SWK Algorithm
    3.2.1 Propagation
    3.2.2 Backtrace
    3.2.3 Assign Inputs
    3.2.4 Backtrack
    3.2.5 Initialize Multiple Objectives
    3.2.6 Dynamic Test Compaction
    3.2.7 Static Test Compaction
    3.2.8 Complex Cells
  3.3 Experimental Results
  3.4 Parallelism Analysis
  3.5 Summary
4 GPU Parallelization
  4.1 GPU Porting
    4.1.1 GPU Memory Allocation
    4.1.2 GPU Memory Access Optimization
  4.2 Test Generation Kernel
    4.2.1 Word-Level Parallel Test Generation for a Block
    4.2.2 Block-Level Parallel Test Generation for a Kernel
    4.2.3 Device-Level Parallel Test Generation for a CUT
  4.3 Fault Simulation Kernel
  4.4 Experimental Results
  4.5 Parallelism Analysis
  4.6 Summary
5 High Quality Test Generation
  5.1 Test Selection for Small Delay Defects
    5.1.1 Fault Dropping Criterion
    5.1.2 Upper and Lower Bound Analysis
    5.1.3 Build Dictionary
    5.1.4 Greedy Selection
    5.1.5 Experimental Results
    5.1.6 Summary
  5.2 Timing-Aware Test Generation
    5.2.1 Test Generation Kernel
    5.2.2 Fault Simulation
    5.2.3 Experimental Results
    5.2.4 Summary
  5.3 Gate Exhaustive Transition Test Generation
    5.3.1 GET Coverage and GET SDQL
    5.3.2 Test Generation
    5.3.3 Experimental Results
    5.3.4 Summary
  5.4 Summary
6 Conclusion
Reference

File size: 4133937 bytes
Format: application/pdf
Thesis release date: 2014/07/29
Thesis usage rights: paid licensing authorized (royalties returned to the university)
Keywords: graphics processing unit (GPU), test quality, automatic test pattern generator (ATPG), parallel computing
Title: GPU-Based High Quality ATPG (original Chinese title: 利用圖形處理器高品質測試向量產生器)
Type: thesis
File: http://ntur.lib.ntu.edu.tw/bitstream/246246/263965/1/ntu-103-F97943076-1.pdf
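
Illustration of word-level parallelism: the abstract states that decisions are converted into parallel bitwise logic operations so that many of them can be tried at once. The following is a minimal sketch of that general idea only, not the thesis's SWK implementation. The example circuit, the word width W = 8, and the names simulate and satisfying_clones are all assumptions made for illustration. Each bit position of a word plays the role of one clone holding a different input assignment, so a single bitwise pass evaluates all W assignments.

```python
# Hypothetical sketch of bit-parallel evaluation; not the thesis's code.
W = 8                    # assumed word width: number of clones evaluated at once
MASK = (1 << W) - 1      # keeps every result within W bits


def simulate(a: int, b: int, c: int) -> int:
    """Evaluate y = (a AND b) OR (NOT c) for all W clones in one pass."""
    return ((a & b) | (~c & MASK)) & MASK


def satisfying_clones(word: int) -> list[int]:
    """Return the clone indices (bit positions) whose value is 1."""
    return [i for i in range(W) if (word >> i) & 1]


# Bit i of each word is the value that input takes in clone i.
# These three words together enumerate all 8 assignments of (a, b, c).
a = 0b11110000
b = 0b11001100
c = 0b10101010

y = simulate(a, b, c)          # one word holds the output of every clone
hits = satisfying_clones(y)    # clones in which the objective y = 1 is met
```

Here the objective "set y to 1" is checked for eight input assignments with three bitwise operations instead of eight separate evaluations. A wider word, or many words processed by GPU threads, would extend the same pattern.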