Efficient Classification for Mining Concept-Drifting Data Streams

Lee, Yi-yao

Date Issued

2005

Author(s)

Lee, Yi-yao

Language

en-US

URI

http://ntur.lib.ntu.edu.tw//handle/246246/53585

Abstract

In this thesis we devise a concept-drift-driven classification algorithm, called SODA (Speedy Concept-Drift Detection Algorithm), to mine data streams with concept drift. SODA is an online incremental learning algorithm that keeps its model consistent with new concepts and processes each example in constant time. Its contributions are manifold. We address the problem of detecting concept drift by inspecting the distribution of the single attribute that is most discriminative with respect to the target class. SODA captures concept drifts in data streams efficiently while attending to both the execution performance and the accuracy of the classifier. The empirical studies in Section 4 show that, by applying an efficient split-checking method, concept-drift detection with statistical analysis, and an effective alternative-tree selection strategy, SODA outperforms prior work in execution efficiency, concept-drift detection performance, and economical use of memory. The concepts in data streams can thus be captured and learned efficiently, and SODA strikes a balance between memory usage and classifier accuracy.

Subjects

Data Streams

Concept Drift

Decision Trees

Classification

Data Mining

Type

thesis
