Parallelized Particle Filter Design for CUDA Based Computing Platforms
Date Issued
2010
Date
2010
Author(s)
Chao, Min-An
Abstract
Particle filtering is a sequential Monte Carlo (SMC) method that outperforms traditional Kalman-based filters in a wide range of real-world applications involving nonlinear/non-Gaussian Bayesian estimation, such as target tracking in surveillance systems, recognition in robot vision, positioning, and navigation. Because particle filtering demands a high degree of reconfigurability, fast prototyping, and online parallel signal processing, the emerging GPU platform called compute unified device architecture (CUDA) may be regarded as the most appealing platform for implementation. Since CUDA-based platforms feature the single-instruction multiple-thread (SIMT) execution model and a hierarchical memory model for fine-grained scalability, how to implement an efficient parallelized particle filter design on CUDA is an essential yet unsolved problem.
The objective of this thesis is to provide an efficient method for implementing parallelized particle filters on CUDA-based computing platforms, supported by conceptual and quantitative analysis. Based on an analysis of parallelization degree and data locality, two design techniques, 1) finite-redraw importance-maximizing (FRIM) prior editing and 2) localized resampling, are proposed to overcome the bottleneck stage of particle filtering, i.e., the resampling stage, which involves data-dependent global operations. Since the characteristics of CUDA favor fast data-independent parallel computation over slow global operations, the proposed techniques aim to reduce the time-consuming global operations at the cost of a small overhead of additional local computation.
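The idea of keeping resampling local can be illustrated with a minimal sequential sketch. It is not the thesis's algorithm: the abstract does not specify how localized resampling is carried out, so the sketch assumes that particles are split into fixed-size groups (standing in for CUDA thread blocks), that each group is resampled independently with systematic resampling, and that each group keeps its total weight by spreading it evenly over its resampled particles. The names `systematic_resample`, `localized_resample`, and `group_size` are hypothetical.

```python
import random


def systematic_resample(weights, rng):
    """Return indices chosen by systematic resampling over `weights`."""
    n = len(weights)
    step = sum(weights) / n
    u = rng.random() * step          # one random offset shared by all draws
    indices = []
    cumulative = weights[0]
    j = 0
    for i in range(n):
        target = u + i * step
        # Advance until the cumulative weight covers the target position.
        while cumulative < target and j < n - 1:
            j += 1
            cumulative += weights[j]
        indices.append(j)
    return indices


def localized_resample(particles, weights, group_size, rng):
    """Resample each group independently (assumed scheme, not the thesis's).

    No cumulative sum spans more than one group, so the only quantity a
    group would need to share globally is its total weight.
    """
    new_particles, new_weights = [], []
    for start in range(0, len(particles), group_size):
        group = particles[start:start + group_size]
        w = weights[start:start + group_size]
        group_mass = sum(w)
        if group_mass <= 0.0:
            # Nothing to resample from: keep the group unchanged.
            new_particles.extend(group)
            new_weights.extend(w)
            continue
        idx = systematic_resample(w, rng)
        new_particles.extend(group[k] for k in idx)
        # Assumption: the group's mass is preserved and shared evenly.
        new_weights.extend([group_mass / len(group)] * len(group))
    return new_particles, new_weights


rng = random.Random(0)
particles = list(range(8))
weights = [0.0, 0.0, 0.0, 0.4, 0.1, 0.1, 0.1, 0.3]
new_particles, new_weights = localized_resample(particles, weights, 4, rng)
```

In this example the first group collapses onto particle 3, the only one with non-zero weight, while the second group is resampled among its own four particles. On a GPU each group would map naturally to one thread block working in shared memory, which is the kind of data locality the abstract refers to.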
The implementation results not only validate the analysis of parallelization degree and data locality in particle filters, but also verify the tradeoff between the reduction in global operations and the overhead of local computation. With the proposed techniques, particle filters can be implemented on CUDA-based platforms with smaller sample sizes and shorter execution times. On low- and middle-end CUDA-enabled platforms, the NVIDIA GeForce 9400M and GTS250, the speedup achieved by the proposed techniques reaches 5.73 and 5.37 times, respectively, compared with direct implementations on the same platforms.
Subjects
Particle filter
Parallelized design
CUDA
GPGPU
Type
thesis
File(s)
Name
ntu-99-R97943028-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):75dcfc67af1b99ef6da0c56843f6e1ae
