Memory Access Optimization for Data-parallel GPU Architectures
Date Issued
2016
Author(s)
Wei, Ming-Feng
Abstract
Global memory accesses incur latencies of hundreds of cycles, so the performance of heterogeneous applications can degrade significantly as global memory accesses increase. In this thesis, we present a mathematical model that captures the global memory accesses issued within a group of threads, together with a metric that quantifies the degree of inefficient serial accesses in the GPU memory system. Based on an analysis of the serial accesses caused by global memory accesses both within a work-group and among work-groups, we propose an approach to the memory access problem in GPUs. Evaluation on various kernel functions shows that kernels run with the work-group size suggested by our methodology outperform the same kernels run with the work-group size recommended by hardware vendors. Heterogeneous applications executing on GPUs can thus gain better performance without any code modification, simply by applying the memory access optimization through the work-group sizing suggested by our methodology.
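To illustrate why work-group sizing requires no kernel code modification, the sketch below shows that the size is a launch-time parameter chosen on the host (here in CUDA terms, where a thread block corresponds to an OpenCL work-group). The kernel `scale`, the candidate sizes, and the brute-force timing loop are illustrative assumptions only; the thesis instead suggests a size analytically from its model of serial memory accesses.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Simple memory-bound kernel: each thread reads and writes one element of
// global memory. How accesses within a block (work-group) are serialized in
// the memory system depends on the launch configuration, not on this code.
__global__ void scale(const float *in, float *out, int n, float alpha) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = alpha * in[i];
}

int main() {
    const int n = 1 << 20;
    float *in = nullptr, *out = nullptr;
    cudaMalloc(&in, n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));
    cudaMemset(in, 0, n * sizeof(float));

    // Warm-up launch so the first timed candidate is not penalized.
    scale<<<(n + 255) / 256, 256>>>(in, out, n, 2.0f);
    cudaDeviceSynchronize();

    // Candidate work-group (block) sizes; only the launch parameters change,
    // the kernel itself is untouched.
    int candidates[] = {64, 128, 256, 512, 1024};
    for (int block : candidates) {
        int grid = (n + block - 1) / block;
        cudaEvent_t t0, t1;
        cudaEventCreate(&t0); cudaEventCreate(&t1);
        cudaEventRecord(t0);
        scale<<<grid, block>>>(in, out, n, 2.0f);
        cudaEventRecord(t1);
        cudaEventSynchronize(t1);
        float ms = 0.0f;
        cudaEventElapsedTime(&ms, t0, t1);
        printf("block size %4d: %.3f ms\n", block, ms);
        cudaEventDestroy(t0); cudaEventDestroy(t1);
    }
    cudaFree(in); cudaFree(out);
    return 0;
}
```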
Subjects
Memory Access Optimization
Type
thesis
File(s)
Name
ntu-105-R02921043-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):c4fd326360b34eebc62cc6df3b4f3f34
