陳銘憲
National Taiwan University: Graduate Institute of Electrical Engineering
羅仕翔 (Lo, Shih-Hsiang)
2007-11-26; 2018-07-06
2007
http://ntur.lib.ntu.edu.tw//handle/246246/53109

A pervasive web application is a server that provides many web services to its registered users. Three basic services that a typical pervasive web application offers today are membership management, search, and a map-enabled photo service. In this thesis, we design a data mining framework composed of three data mining techniques, one to improve the performance of each of these services.

To improve membership management, in the second chapter we develop a categorical decision tree classifier that classifies users efficiently. User profile data has a distinctive characteristic: only a few attributes have information gain high enough to distinguish users. By exploiting this characteristic, which a traditional decision tree classifier does not consider, our classifier reduces the execution time needed to generate a decision tree for user classification. As a result, the generated decision tree can identify users efficiently for the special marketing needs of an advertisement.

To improve the search service, in the third chapter we propose a sequential web search algorithm that leverages the sequential queries issued by users to find the required information. Whereas prior work uses feedback data from a single query only, our approach also uses the feedback data on the result pages of sequential queries. It can therefore provide a better ranking of result pages for sequential queries.

To retrieve geotagged photos efficiently, in the fourth chapter we design a clustering algorithm that incrementally clusters geotagged photos according to thresholds at different scales.
Unlike other applications, we display photo clusters rather than every photo. Because the number of photo clusters is much smaller than the number of photos, the performance of the map-enabled photo service improves.

1 Introduction
  1.1 Motivation
  1.2 Overview of the Dissertation
    1.2.1 Inference Based Classifier: Efficient Construction of Decision Trees for Sparse Categorical Attributes
    1.2.2 Effective Sequential Web Search with Personal Page Eigenvectors
    1.2.3 Geotagged Photos Clustering Algorithm for a Map-Enabled Photo Web Service
  1.3 Organization of the Dissertation
2 Inference Based Classifier: Efficient Construction of Decision Trees for Sparse Categorical Attributes
  2.1 Introduction
  2.2 Preliminaries
  2.3 Inference Based Classifier
    2.3.1 Algorithm of IBC
  2.4 Performance Studies
    2.4.1 Real-life Datasets
    2.4.2 Experiment One: Classification Accuracy
    2.4.3 Experiment Two: Execution Time in Scale-Up Experiments for data set of sparse categorical attributes
  2.5 Summary
3 Effective Sequential Web Search with Personal Page Eigenvectors
  3.1 Introduction
  3.2 Problem Statement
  3.3 Incremental Personal HITS
    3.3.1 Review of HITS
    3.3.2 IPHITS Algorithm
  3.4 System Framework
    3.4.1 Design of a Search Proxy
    3.4.2 Design of Feedback Mechanism
    3.4.3 Design of Ranking Refinement
  3.5 Experimental Platform
    3.5.1 System Architecture of TOP Platform
  3.6 Experimental Result
  3.7 Summary
4 Geotagged Photos Clustering Algorithm for a Map-Enabled Photo Web Service
  4.1 Introduction
    4.1.1 Proposed Framework of a Map-Enabled Photo Web Service
  4.2 Related Work
  4.3 Problem Statement
  4.4 Geotagged Photo Clustering Algorithm
    4.4.1 Usage Scenario 1
    4.4.2 Usage Scenario 2
    4.4.3 Design of the Client Program
    4.4.4 Design of the Server Program
    4.4.5 Design of an Incremental Framework
    4.4.6 Photo Clustering Algorithm
    4.4.7 Design of User Clustering Algorithm
    4.4.8 Design of a Map Service
  4.5 Experimental System Framework
    4.5.1 The Architecture of a Client-Side Program
    4.5.2 The Architecture of a Server-Side Program
  4.6 Scenario of Data Synchronous Mechanism
    4.6.1 On-Line Synchronous Mode
    4.6.2 Off-Line Asynchronous Mode
  4.7 Experimental Platform
    4.7.1 The user-interface of the client-program
    4.7.2 The user-interface of the server-program
    4.7.3 The Experimental Result of Displaying Different Photo Clusters
    4.7.4 The Experimental Result of Displaying Different Map Scales
  4.8 Summary
5 Conclusions

3068970 bytes
application/pdf
en-US
pervasive applications; data mining; decision tree; personalized search; geotagged clustering
應用於廣泛網路應用之資訊勘測
Mining Framework for Pervasive Applications
thesis
http://ntur.lib.ntu.edu.tw/bitstream/246246/53109/1/ntu-96-F86921019-1.pdf
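
The abstract observes that only a few user-profile attributes carry high information gain. As a rough illustration of that quantity, the sketch below computes the standard entropy-based information gain of a categorical attribute. It is not the thesis's Inference Based Classifier; the function names, the attribute names (`plan`, `browser`, `segment`), and the four sample profiles are hypothetical and chosen only to make the contrast visible.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, label_key):
    """Entropy of the labels minus the weighted entropy after splitting on `attribute`."""
    labels = [row[label_key] for row in rows]
    # Group the class labels by the value each row takes for the attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[label_key])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical user profiles: "plan" separates the segments, "browser" does not.
profiles = [
    {"plan": "premium", "browser": "A", "segment": "target"},
    {"plan": "premium", "browser": "B", "segment": "target"},
    {"plan": "free", "browser": "A", "segment": "other"},
    {"plan": "free", "browser": "B", "segment": "other"},
]

gains = {a: information_gain(profiles, a, "segment") for a in ("plan", "browser")}
for name, gain in gains.items():
    print(f"{name}: {gain:.3f}")
```

In this toy data the gain is concentrated in a single attribute, which is the kind of skew the abstract says the classifier exploits; how the thesis actually uses that skew to shorten tree construction is described in its second chapter, not here.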