A Cost-Effective System for Real-Time Big Data Processing
Date Issued
2016
Author(s)
Tsai, Linjiun
Abstract
The emerging Big Data paradigm has attracted attention from a wide variety of industry sectors, including healthcare, finance, retail, and manufacturing. To process massive heterogeneous data in near real time, Big Data applications should run on dedicated server clusters that aggregate substantial computing power, memory, and storage through fast, unimpeded, and reliable network infrastructures. Implementing such high-performance cluster computing is typically not economical for companies that have only occasional demand for Big Data processing. Cloud computing is considered a viable way to reduce the operating costs of Big Data applications thanks to its on-demand, pay-per-use, and scalable nature. The shared nature of cloud data centers, however, can make application performance unpredictable. The strict network requirements and extremely large memory demands of Big Data clusters also make it difficult to optimize the allocation of cloud resources. These difficulties translate into a higher hosting cost per application.

This dissertation proposes a solution that allows more concurrent Big Data applications to be deployed in cloud data centers in the most resource-efficient way while meeting their real-time requirements. To this end, we present 1) the first resource allocation framework that guarantees network performance for each Big Data cluster in multi-tenant clouds, 2) the first machine learning model that predicts the most efficient memory size for each Big Data cluster according to given upper bounds on performance penalties, and 3) an adaptive resource consolidation mechanism that strikes a balance between the number of required servers and the overhead of dynamic server consolidation for each cluster.

The resource allocation framework exploits the symmetry of the fat-tree network structure so that data center networks can be efficiently partitioned into mutually exclusive and collectively exhaustive star networks, each allocated to a Big Data cluster. It provides several desirable properties:
1) every cluster is isolated from the others;
2) the topology of every cluster is non-blocking for arbitrary traffic patterns;
3) the number of links used to form each cluster is minimal;
4) the per-hop distance between any two servers in a cluster is equal;
5) the network topology allocated to each cluster is guaranteed to remain logically unchanged during and after reallocation;
6) for fault-tolerant allocation, the number of backup links connecting backup and active servers is minimal;
7) data center networks can be elastically trimmed and expanded while all of the above properties are maintained.
Building on these properties, a cost-bounded resource reallocation mechanism is also proposed that makes nearly full use of cloud resources in polynomial time.

The model for predicting the optimal memory size is designed to capture the memory management behavior of Java virtual machines as well as the dynamic changes in memory consumption on distributed compute nodes. In experiments on a physical Spark cluster with 128 cores and 1 TB of memory, the model shows good prediction accuracy and saves a significant amount of memory for Big Data applications that demand up to hundreds of gigabytes of working memory.
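The abstract describes the fat-tree partitioning idea only at a high level. Purely as an illustrative sketch of the kind of placement such a framework performs, and not the dissertation's actual algorithm, the following Python example greedily allocates star-shaped clusters in a simplified k-ary fat-tree; the FatTree class, its fields, and the preference order between 2-hop and 4-hop stars are assumptions made for this example.

# Illustrative sketch only: a simplified greedy allocator for star-shaped
# server clusters in a k-ary fat-tree. The dissertation's framework is more
# general (reallocation, fault tolerance, elastic trimming); the names and
# the placement policy below are hypothetical.

from dataclasses import dataclass, field

@dataclass
class FatTree:
    k: int                          # switch port count of the k-ary fat-tree
    free: list = field(init=False)  # free[pod][edge] = free server slots

    def __post_init__(self):
        # Each pod holds k/2 edge switches, each serving k/2 servers.
        half = self.k // 2
        self.free = [[half for _ in range(half)] for _ in range(self.k)]

    def allocate_star(self, n):
        """Reserve n servers forming a non-blocking star with equal hop distance.

        Assumed preference order for this sketch:
          1) all n servers under one edge switch          -> 2-hop star
          2) one server per edge switch within one pod,
             routed through a single aggregation switch   -> 4-hop star
        Returns (pod, list of edge switches) or None if no placement exists.
        """
        half = self.k // 2
        # Case 1: a single edge switch can host the whole cluster.
        for pod in range(self.k):
            for edge in range(half):
                if self.free[pod][edge] >= n:
                    self.free[pod][edge] -= n
                    return (pod, [edge] * n)
        # Case 2: spread one server per edge switch inside one pod; at most
        # k/2 servers can share one aggregation switch without oversubscription.
        if n <= half:
            for pod in range(self.k):
                edges = [e for e in range(half) if self.free[pod][e] >= 1]
                if len(edges) >= n:
                    chosen = edges[:n]
                    for e in chosen:
                        self.free[pod][e] -= 1
                    return (pod, chosen)
        return None  # would require reallocation or a larger fabric

if __name__ == "__main__":
    tree = FatTree(k=8)              # 8-ary fat-tree: 128 servers in total
    print(tree.allocate_star(4))     # fits under one edge switch (2-hop star)
    print(tree.allocate_star(4))     # second cluster, isolated from the first

Because each allocated star uses either one edge switch or one aggregation switch exclusively, clusters placed this way do not share links, which mirrors the isolation and equal per-hop distance properties listed in the abstract.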
Subjects
Cloud Computing
Big Data
Resource Optimization
Memory Management
Performance-Cost Trade-off
Performance Guarantee
Network Optimization
Type
thesis
File(s)
Name
ntu-105-D97921014-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):65c9285a7e3cec485233b03d61d724ae
