Query Processing Techniques in Dynamic Database Systems
Date Issued
2007
Date
2007
Author(s)
Hung, Hao-Ping
DOI
en-US
Abstract
Advances in computing power have enhanced the ability of database management systems to process user queries effectively and efficiently. However, in many emerging applications, such as mobile computing and streaming environments, conventional query optimization techniques may suffer from the dynamics of either user queries or evolving data entities. For these dynamic database systems, processing queries in real time while providing accurate answers becomes a challenging issue.
In mobile computing environments, broadcast-based dissemination is a scalable way to deliver data items to interested users. Based on the probability distribution of user queries, the broadcast server allocates items to broadcast channels so that the average waiting time of mobile users is minimized. Because modern information services disseminate items of different sizes, Chapter 2 explores the scheduling of heterogeneous items in data broadcasting environments. Given the broadcast database and the number of channels, we first derive an analytical model of heterogeneous data broadcasting to obtain the average waiting time of mobile users, and formulate channel allocation as a grouping problem. To solve this problem, we propose a two-phase architecture for channel allocation. For comparison purposes, we also propose algorithm GA-CDMS, which is based on the concept of a hybrid genetic algorithm.
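As a rough illustration of why channel allocation affects waiting time, the following Python sketch evaluates a deliberately simplified cost model. It assumes each channel cyclically broadcasts its own items, that a user waits half a cycle on average before the requested item starts, and that download time is proportional to item size. The function name `average_waiting_time`, the `bandwidth` parameter, and the sample items are hypothetical; this is not the analytical model derived in the thesis.

```python
def average_waiting_time(channels, bandwidth=1.0):
    """Expected waiting time under a simplified cyclic-broadcast model.

    channels: list of channels; each channel is a list of
              (access_probability, item_size) pairs.
    bandwidth: size units transmitted per time unit (assumed equal per channel).
    """
    total = 0.0
    for items in channels:
        # Time needed to broadcast every item on this channel once.
        cycle = sum(size for _, size in items) / bandwidth
        for prob, size in items:
            # Assumed cost: half a cycle of waiting plus the item's download time.
            total += prob * (cycle / 2.0 + size / bandwidth)
    return total


# Hypothetical items: (access probability, size).
hot, warm, cold = (0.5, 1), (0.3, 2), (0.2, 4)

one_channel = average_waiting_time([[hot, warm, cold]])
two_channels = average_waiting_time([[hot], [warm, cold]])
print(one_channel, two_channels)
```

Under these assumptions, giving the popular small item its own channel lowers the expected waiting time from about 5.4 to about 3.65 time units, which is the kind of trade-off a grouping formulation has to search over.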
Moreover, in many circumstances, mobile users tend to access multiple items within a single session. Consider the special case in which mobile users wish to download these items in a sequential order. To minimize the average access time, the information server should schedule these queries according to the sequential relationship among the items. In Chapter 3, we study scheduling in such a sequential data broadcasting environment. Specifically, we propose a general framework, referred to as MULS, for an information system. MULS has two primary stages: on-line scheduling and an optimization procedure. By combining algorithm OLS with procedure SCI, the MULS framework can generate broadcast programs flexibly, providing different service qualities under different requirements of effectiveness and efficiency.
In streaming environments, the speed at which data evolve and the unbounded number of items mean that the data can be scanned at most once. Because resources in a data stream environment are limited, answering user queries from the wavelet synopsis of a stream has been reported to be an essential capability of a Data Stream Management System (DSMS). In Chapter 4, we first study the problem of maintaining the wavelet coefficients of multiple streams within a collective memory so that a predetermined global error metric is minimized. We also examine a promising application in the multistream environment, namely top-k queries. We address efficient top-k query processing with minimized global error by developing a general framework. To maintain the wavelet coefficients and process top-k queries, several algorithms are used to optimize the performance of each primary component of this framework.
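To make the notion of a wavelet synopsis concrete, here is a minimal single-stream sketch in Python. It assumes an orthonormal Haar transform and a synopsis that keeps the `budget` largest-magnitude coefficients, which is the standard choice for minimizing squared (L2) reconstruction error. All function names are hypothetical, and the sketch does not reproduce the multistream memory allocation or the global error metric studied in the thesis.

```python
import math


def haar_transform(data):
    """Orthonormal Haar DWT of a sequence whose length is a power of two."""
    coeffs = [float(v) for v in data]
    output = coeffs[:]
    n = len(coeffs)
    while n > 1:
        half = n // 2
        for i in range(half):
            a, b = coeffs[2 * i], coeffs[2 * i + 1]
            output[i] = (a + b) / math.sqrt(2)         # scaled average
            output[half + i] = (a - b) / math.sqrt(2)  # scaled detail
        coeffs[:n] = output[:n]
        n = half
    return coeffs


def inverse_haar(coeffs):
    """Inverse of haar_transform."""
    data = list(coeffs)
    n = 1
    while n < len(data):
        temp = data[:2 * n]
        for i in range(n):
            s, d = temp[i], temp[n + i]
            data[2 * i] = (s + d) / math.sqrt(2)
            data[2 * i + 1] = (s - d) / math.sqrt(2)
        n *= 2
    return data


def top_b_synopsis(coeffs, budget):
    """Keep the `budget` largest-magnitude coefficients as {index: value}."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    return {i: coeffs[i] for i in order[:budget]}


def reconstruct(synopsis, length):
    """Approximate the original data, treating dropped coefficients as zero."""
    coeffs = [0.0] * length
    for i, c in synopsis.items():
        coeffs[i] = c
    return inverse_haar(coeffs)


stream = [2, 2, 0, 2, 3, 5, 4, 4]  # hypothetical data cells
synopsis = top_b_synopsis(haar_transform(stream), budget=3)
print(reconstruct(synopsis, len(stream)))
```

Because the transform is orthonormal, the squared reconstruction error equals the sum of the squared coefficients that were dropped, which is why retaining the largest-magnitude coefficients is a natural baseline.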
Motivated by the fact that data cells in streaming environments are usually transformed into coefficients in the frequency domain, this dissertation also addresses an essential question: how can the time-domain similarity between two streams be obtained from wavelet coefficients in the frequency domain? In Chapter 5, we investigate two important types of range-constrained queries in time series streaming environments: distance queries (which obtain the Euclidean distance between two streams) and kNN queries (which find the k nearest neighbors of a reference stream). To process these two types of queries efficiently, we propose procedure RED and algorithm EKS. Compared with existing methods, our approaches have two advantages. First, they can process queries directly from the wavelet synopses retained in main memory, without using the inverse discrete wavelet transform (IDWT) to reconstruct the data cells. Second, they enable users to query the DSMS within their range of interest.
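The idea of computing a time-domain distance without running the IDWT can be illustrated with a short Python sketch. It relies on the fact that an orthonormal Haar transform preserves Euclidean distance, so the distance between two coefficient vectors equals the distance between the original sequences. The names `haar` and `synopsis_distance` are hypothetical, and the sketch covers only whole-stream distance; it is not procedure RED and does not handle range constraints.

```python
import math


def haar(x):
    """Orthonormal Haar coefficients of a sequence whose length is a power of two."""
    if len(x) == 1:
        return [float(x[0])]
    sums = [(x[i] + x[i + 1]) / math.sqrt(2) for i in range(0, len(x), 2)]
    diffs = [(x[i] - x[i + 1]) / math.sqrt(2) for i in range(0, len(x), 2)]
    return haar(sums) + diffs


def synopsis_distance(syn_a, syn_b):
    """Euclidean distance computed from sparse synopses {index: coefficient}.

    Coefficients missing from a synopsis are treated as zero, so the result
    is exact for complete synopses and an approximation for truncated ones.
    """
    indices = set(syn_a) | set(syn_b)
    return math.sqrt(sum((syn_a.get(i, 0.0) - syn_b.get(i, 0.0)) ** 2
                         for i in indices))


# Hypothetical streams; complete synopses are used here for clarity.
x = [2, 4, 6, 8]
y = [1, 3, 5, 9]
syn_x = dict(enumerate(haar(x)))
syn_y = dict(enumerate(haar(y)))

time_domain = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
print(time_domain, synopsis_distance(syn_x, syn_y))
```

With complete coefficient sets the two printed values agree up to floating-point rounding. With truncated synopses, the coefficient-domain value is only an estimate whose quality depends on which coefficients were retained.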
Subjects
Database Systems
User Queries
Query Processing
Dynamic Databases
Mobile Computing
Data Streams
Wavelet Transform
Type
thesis
File(s)
Name
ntu-96-F90942056-1.pdf
Size
23.31 KB
Format
Adobe PDF
Checksum
(MD5):5a2495e8acfd500839ab2684823f9e84
