A Data Parallel Approach to XML Parsing and Query
Date Issued
2011
Author(s)
You, Cheng-Han
Abstract
Data-parallel XML parsing faces a crucial problem: how to partition the XML document. Existing approaches need a pre-parse step to determine the partitions. In this paper, we propose a direct parallel method that solves this problem without pre-parsing. The method starts parallel parsing directly by locating the “light tower”, a particular character that has some exceptions, called clues. We handle the exceptions by watching for the clues and reparsing a partition when the parsing stage requires it. We also propose a non-synchronized splitter approach to parallel XML querying with XPath expressions. In this approach, we split an XPath expression into pieces that are executed by separate threads, and we use a data structure called the ancestor table to let each thread handle its part of the XPath expression independently, without communication between threads. Our experiments show that our approach scales well from small files to very large files.
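The partitioning idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which character serves as the light tower or what the clues are, so this sketch assumes `<` as the boundary character and treats a comment or CDATA closer with no preceding opener as the clue. The function names `find_split_points` and `needs_reparse` are hypothetical and are not taken from the thesis.

```python
from typing import List


def find_split_points(doc: str, n_parts: int, marker: str = "<") -> List[int]:
    """Return partition start offsets, each snapped forward to the next marker.

    Assumption: the boundary character is '<'. No pre-parse is done; each
    start is only a guess that may later turn out to be wrong.
    """
    size = len(doc)
    starts = [0]
    for i in range(1, n_parts):
        guess = i * size // n_parts
        pos = doc.find(marker, guess)
        if pos == -1:
            break  # no marker left, so no further partitions
        if pos > starts[-1]:
            starts.append(pos)
    return starts


def needs_reparse(chunk: str) -> bool:
    """Assumed clue check for one partition.

    If a closer ('-->' or ']]>') appears before any matching opener, the
    partition began inside a comment or CDATA section, so the '<' chosen
    as its start was not real markup and the partition must be reparsed.
    """
    for opener, closer in (("<!--", "-->"), ("<![CDATA[", "]]>")):
        close_at = chunk.find(closer)
        if close_at == -1:
            continue
        open_at = chunk.find(opener)
        if open_at == -1 or open_at > close_at:
            return True
    return False


if __name__ == "__main__":
    doc = "<a><b>x</b><!-- <c> fake --><d>y</d></a>"
    starts = find_split_points(doc, 4)
    bounds = starts + [len(doc)]
    for begin, end in zip(bounds, bounds[1:]):
        chunk = doc[begin:end]
        print(begin, repr(chunk), "reparse" if needs_reparse(chunk) else "ok")
```

In a real parser each partition would be handed to its own thread, and a partition flagged by the clue check would be merged with its predecessor or parsed again from a corrected start. That scheduling is omitted here.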
Subjects
XML parsing
XML querying
XPath
multi-core
parallel
VTD-XML
reparsing
direct parallel
Type
thesis
File(s)
Name
ntu-100-R98921069-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):4df5acbc49e8d9cf7307f518bbb1b459
