Inexact and Mixed Precision Eigenvalue Solvers on GPU
Date Issued
2014
Date
2014
Author(s)
Huang, Jhih-Ming
Abstract
Eigenvalue problems are among the most important topics in engineering and science today. In practical applications, the target matrix is usually large and sparse, so solving the eigenvalue problem requires a huge amount of computation. Efficiency is therefore in strong demand, and High Performance Computing (HPC) plays an important role. One important approach to higher performance is mixed precision design, in which the arithmetic precision is changed during the computation without degrading the final accuracy. Single precision requires less memory and may yield a higher cache hit ratio, which can affect performance considerably. In addition, some numerical operations are faster in single precision than in double precision. Hence, if an algorithm is accuracy insensitive, meaning that it can lose some accuracy during the computation and still reach the same final accuracy, it is suitable for redesign as a mixed precision algorithm to enhance performance. The eigensolver we focus on belongs to exactly this type. The Shift-Invert Residual Arnoldi (SIRA) algorithm is a well-known eigenvalue solver that consists of an inner loop and an outer loop. The inner loop solves a linear system to find a correction direction that helps the outer loop find the desired eigenpair. The efficiency of SIRA relies on the solution of the inner-loop linear systems. These systems can be solved to lower accuracy without degrading the final accuracy of the target eigenvalues. By taking advantage of this algorithmic feature and the computational power of the GPU, we develop a mixed precision eigensolver in this research. We develop a method, called the pocket method, that adaptively chooses double or single precision to solve the linear system. Moreover, while solving the linear system, it automatically adjusts the inner tolerance and the timing of exiting the inner loop. The pocket method has the best performance in most of our experiments.
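The inner/outer structure described in the abstract can be illustrated with a minimal sketch. This is not the thesis's pocket method or its GPU implementation. It assumes a small dense symmetric matrix, uses a NumPy direct solve in single precision as a stand-in for the iterative inner solver, and omits the adaptive precision and tolerance control. The function name `mixed_precision_sira_sketch` and all of its parameters are hypothetical.

```python
import numpy as np

def mixed_precision_sira_sketch(A, sigma, tol=1e-10, max_outer=60):
    """Residual-Arnoldi-style iteration for the eigenpair of symmetric A nearest sigma.

    Assumption: the inner system (A - sigma*I) z = r is solved with a dense
    float32 solve; the outer loop (projection, residual) stays in float64.
    """
    n = A.shape[0]
    # Inner-loop operator stored in single precision.
    M_low = (A - sigma * np.eye(n)).astype(np.float32)

    rng = np.random.default_rng(0)
    V = rng.standard_normal((n, 1))
    V /= np.linalg.norm(V)

    theta, x = sigma, V[:, 0]
    for k in range(max_outer):
        # Outer loop: Rayleigh-Ritz projection in double precision.
        H = V.T @ A @ V
        H = (H + H.T) / 2
        ritz_vals, S = np.linalg.eigh(H)
        j = np.argmin(np.abs(ritz_vals - sigma))
        theta, x = ritz_vals[j], V @ S[:, j]

        r = A @ x - theta * x
        if np.linalg.norm(r) < tol:
            return theta, x, k

        # Inner loop: low-accuracy correction direction in single precision.
        z = np.linalg.solve(M_low, r.astype(np.float32)).astype(np.float64)

        # Orthogonalize twice against the current basis in double precision.
        z -= V @ (V.T @ z)
        z -= V @ (V.T @ z)
        nz = np.linalg.norm(z)
        if nz < 1e-14:
            break  # no new direction available
        V = np.hstack([V, (z / nz)[:, None]])
    return theta, x, max_outer

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 50
    E = 0.01 * rng.standard_normal((n, n))
    A = np.diag(np.arange(1.0, n + 1)) + (E + E.T) / 2
    theta, x, iters = mixed_precision_sira_sketch(A, sigma=10.3)
    print(theta, np.linalg.norm(A @ x - theta * x), iters)
```

In this sketch the convergence test uses the residual computed in double precision, so the low-precision inner solve only influences how useful each new search direction is.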
Subjects
Eigenvalue problem
Graphics processing unit (GPU)
Mixed precision
Type
thesis
File(s)
Name
ntu-103-R01221022-1.pdf
Size
23.54 KB
Format
Adobe PDF
Checksum
(MD5):27c8ad796bf3fee124ee9556192de83f
