Parallelized Particle Filter Design for CUDA Based Computing Platforms

Chao, Min-An

Date Issued

2010

Author(s)

Chao, Min-An

URI

http://ntur.lib.ntu.edu.tw//handle/246246/256982

Abstract

Particle filtering is a sequential Monte Carlo (SMC) method that outperforms traditional Kalman-based filters in a wide range of real-world applications involving nonlinear/non-Gaussian Bayesian estimation, such as target tracking in surveillance systems, recognition in robot vision, positioning, and navigation. Because particle filters demand a great deal of reconfigurability, fast prototyping, and online parallel signal processing, the emerging GPU platform called compute unified device architecture (CUDA) may be regarded as the most appealing platform for implementation. Since CUDA-based platforms feature the single-instruction multiple-thread (SIMT) execution model and a hierarchical memory model for fine-grained scalability, how to implement an efficient parallelized particle filter on CUDA becomes an essential yet unsolved problem. The objective of this thesis is to provide an efficient implementation method for parallelized particle filters on CUDA-based computing platforms, supported by conceptual and quantitative analysis. Based on an analysis of parallelization degree and data locality, two design techniques, 1) finite-redraw importance-maximizing (FRIM) prior editing and 2) localized resampling, are proposed to overcome the bottleneck stage of particle filtering, i.e., the resampling stage, which involves data-dependent global operations. Since the characteristics of CUDA favor fast data-independent parallel computation over slow global operations, the proposed techniques aim to reduce the time-consuming global operations at the cost of a small overhead of additional local computation. The implementation results not only validate the analysis of the parallelization degree and data locality of particle filters, but also verify the tradeoff between the reduction in global operations and the local computation overhead. By using the proposed techniques, particle filters can be implemented on CUDA-based platforms with smaller sample sizes and shorter execution times. On the low-end and mid-range CUDA-enabled platforms NVIDIA GeForce 9400M and GTS 250, the speedup brought by the proposed techniques reaches 5.73 and 5.37 times, respectively, compared with direct implementations on the same platforms.
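The contrast the abstract draws between global and local operations in the resampling stage can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the abstract does not define localized resampling, so the sketch assumes one possible reading in which particles are resampled independently inside fixed-size groups (on a GPU, a group could map to one thread block). The function names `systematic_resample`, `global_resample`, and `grouped_resample`, and the `group_size` parameter, are hypothetical, and systematic resampling is swapped in as a common textbook choice.

```python
import random


def systematic_resample(weights, rng):
    """Pick len(weights) indices by systematic resampling (weights may be unnormalized)."""
    n = len(weights)
    total = sum(weights)
    if total <= 0:
        # Degenerate case: no information in the weights, keep every particle.
        return list(range(n))
    step = total / n
    u = rng.random() * step  # one random offset shared by all n draws
    indices = []
    i = 0
    cumulative = weights[0]
    for k in range(n):
        target = u + k * step
        # Advance to the first particle whose cumulative weight exceeds the target.
        while cumulative <= target and i < n - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices


def global_resample(particles, weights, rng):
    """Resample over the whole particle set: needs a cumulative sum across all weights."""
    return [particles[i] for i in systematic_resample(weights, rng)]


def grouped_resample(particles, weights, group_size, rng):
    """Assumed local variant: each contiguous group is resampled on its own.

    No data crosses group boundaries, so groups could run in parallel.
    This sketch ignores how weight mass would be balanced between groups.
    """
    out = []
    for start in range(0, len(particles), group_size):
        group = particles[start:start + group_size]
        idx = systematic_resample(weights[start:start + group_size], rng)
        out.extend(group[i] for i in idx)
    return out
```

With weights concentrated unevenly, the global version moves more copies toward the heavier particle, while the grouped version keeps a fixed number of particles per group. That is the kind of accuracy-versus-communication tradeoff the abstract alludes to, shown here only under the stated assumptions.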

Subjects

Particle filter

Parallelized design

CUDA

GPGPU

Type

thesis

File(s)

Name

ntu-99-R97943028-1.pdf

Size

23.32 KB

Format

Adobe PDF

Checksum

(MD5):75dcfc67af1b99ef6da0c56843f6e1ae
