Efficient String Matching Engines with Multiple-Character State Transitions
Date Issued
2014
Date
2014
Author(s)
Chen, Chien-Chi
Abstract
The algorithm of Aho and Corasick (AC-algorithm) can locate multiple keywords in a text efficiently; and thus it is widely used in the exact string matching. However, the AC-algorithm processes the text character by character with a performance limited by the processing clock. This thesis proposes novel string-matching architectures with multiple-character state transitions based on the AC-algorithm, which can inspect multiple characters in parallel to multiply the throughput. At first, three finite automaton (FA) approaches including Deterministic Finite Automaton (DFA), Nondeterministic Finite Automaton (NFA), and hybrid FA approaches are presented to implement the AC-algorithm. Subsequently, a derivation algorithm is proposed to derive these FAs to multi-character FAs to allow for the inspection of multiple characters in parallel. Additionally, the alignment problem, which occurs while multiple characters are being inspected in parallel, is solved by using assistant transitions. The derived multi-character FAs are evaluated on programmable devices and the evaluation results indicate that the throughput can be multiplied effectively. The achievable throughput of the best result is 21.4Gbps obtained by a 16-character AC-NFA implementation operating at a 167.36MHz clock.
Subjects
字串比對
確定性及非確定有限自動機
入侵偵測系統
Type
thesis
File(s)![Thumbnail Image]()
Loading...
Name
ntu-103-D93921012-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):fa6a69bbb85c40b86032d6e74b049eed