陳少傑
National Taiwan University: Graduate Institute of Electrical Engineering
簡振哲 (Jean, Cheng-Cho)
2007-11-26
2018-07-06
2007
http://ntur.lib.ntu.edu.tw//handle/246246/53177

Multimedia instruction sets are very common in general-purpose processors. They consist of a set of short vector instructions. To fully exploit the performance of a single-instruction, multiple-data (SIMD) architecture, efficient SIMD code must be generated; only then can the architecture deliver its full benefit. This thesis develops SIMD compiler techniques that use short vector instruction sets effectively to generate efficient SIMD code automatically. Short-vector SIMD architectures do not support operations on non-contiguous memory addresses, so the data must be rearranged. The problems to be solved are how to arrange non-contiguous memory addresses contiguously and how to handle multiple data types effectively.

Multimedia extensions are nearly ubiquitous in today's general-purpose processors. These extensions consist primarily of a set of short vector instructions that apply the same opcode to a vector of operands. Exploiting the SIMD capabilities of these architectures requires generating efficient simdized code. This thesis sets out to develop a vector SIMD compiler with techniques that target short vector instructions effectively and automatically. Operations on non-contiguous vector elements are not supported and require explicit data realignment. Thus, one of the most common aspects of compilation is the effective management of memory alignment and of mixed data types. We identify several new challenges that arise in simdizing multimedia applications and provide some solutions to them. Simdizing such computation efficiently is therefore an ambitious challenge for compiler designers.
We implemented an automatic simdization framework that supports effective simdization in the presence of control flow, memory misalignment, and mixed length conversion.

ABSTRACT
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 INTRODUCTION
  1.1 Motivation
  1.2 Objectives
  1.3 Organization of the Thesis
CHAPTER 2 BACKGROUND
  2.1 Overview of a Basic Compiler
  2.2 Lexical Analysis
  2.3 Syntax Analysis
  2.4 Semantic Analysis
  2.5 Intermediate Representation
  2.6 Machine-Independent and Machine-Dependent Optimizations
  2.7 Code Generation
  2.8 Overview of LCC Compiler Infrastructure
  2.9 Overview of GCC Compiler Infrastructure
  2.10 Overview of SUIF Compiler Infrastructure
  2.11 Overview of Machine SUIF Compiler Infrastructure
  2.12 Overview of Stream Shift Policy
    2.12.1 Stream Shift Policy: Zero-Shift
    2.12.2 Stream Shift Policy: Eager-Shift
    2.12.3 Stream Shift Policy: Lazy-Shift
    2.12.4 Stream Shift Policy: Dominant-Shift
CHAPTER 3 BACKGROUND ON PARALLELIZATION
  3.1 Vector Parallelization
  3.2 Loop Level Parallelization
  3.3 SIMD Parallelization
  3.4 Instruction Level Parallelization
CHAPTER 4 DEVELOPMENT OF A SIMD COMPILER
  4.1 PLX Architecture Overview
  4.2 Simdization Framework Overview
    4.2.1 Control Flow Conversion
    4.2.2 Loop Unrolling
    4.2.3 Dataflow Optimization
    4.2.4 Superword-Level Parallelism
    4.2.5 Basic Block Level Aggregation
    4.2.6 Short-Loop Aggregation
    4.2.7 Loop-Level Aggregation
    4.2.8 Alignment Devirtualization
      4.2.8.1 SIMD Hardware Constraints
      4.2.8.2 Data Reorganization Graph
    4.2.9 Length Devirtualization
    4.2.10 SIMD Code Generation
  4.3 Example of Mixed Sources of SIMD Parallelism
  4.4 Testing and Validation Methodologies
    4.4.1 Background
    4.4.2 Methodologies
CHAPTER 5 EXPERIMENTAL RESULTS AND DISCUSSION
  5.1 Benchmarks
  5.2 Evaluating Execution Cycles
  5.3 Performance Gain Evaluation
CHAPTER 6 CONCLUSION
  6.1 Future Work
REFERENCES

1152316 bytes
application/pdf
en-US
SIMD compiler; multimedia instruction set; SIMD; compiler; SUIF
單一指令多重資料編譯器之開發 (Development of a SIMD Compiler)
thesis
http://ntur.lib.ntu.edu.tw/bitstream/246246/53177/1/ntu-96-R94921126-1.pdf