Optimization of Hadoop System Configuration Parameters
Date Issued
2015
Date
2015
Author(s)
Zhuo, Ye-Qi
Abstract
Hadoop has become very popular in recent years. It is a software framework for the distributed processing of large-scale data sets on a cluster of machines using the MapReduce programming model. However, Hadoop users still face two essential challenges in managing a Hadoop system: (1) tuning the parameters appropriately, and (2) dealing with the dozens of configuration parameters that affect its performance. This paper focuses on optimizing Hadoop MapReduce job performance. Our approach has two key models: Prediction and Optimization. The Prediction model estimates the execution time of a MapReduce job, and the Optimization model searches for approximately optimal configuration parameters by invoking the Prediction model repeatedly. We use an analytical method to choose approximately optimal configuration parameters and thereby improve users' job performance. Besides configuration parameter tuning, the relevance of each parameter and the evaluation of our methods are also discussed in this paper. Our work may give users a better way to improve Hadoop system performance and save hardware resources.
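The predict-then-search loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not list the tuned parameters or give the form of the Prediction model, so the two property names, their candidate values, and the cost formula in `predict_runtime` are hypothetical stand-ins. The search is a plain exhaustive grid search, swapped in here for whatever search strategy the thesis actually uses.

```python
import itertools

# Hypothetical search space: the abstract does not name the tuned parameters.
SEARCH_SPACE = {
    "mapreduce.job.reduces": [1, 2, 4, 8, 16],
    "mapreduce.task.io.sort.mb": [50, 100, 200, 400],
}


def predict_runtime(config):
    """Toy stand-in for the Prediction model (NOT the model from the thesis).

    Returns an estimated job time in arbitrary units for one configuration.
    """
    reduces = config["mapreduce.job.reduces"]
    sort_mb = config["mapreduce.task.io.sort.mb"]
    shuffle_cost = 120.0 / reduces   # more reducers: more parallelism
    startup_cost = 3.0 * reduces     # each reducer adds fixed overhead
    spill_cost = 4000.0 / sort_mb    # larger sort buffer: fewer spills
    memory_cost = sort_mb / 16.0     # larger buffer: more memory pressure
    return shuffle_cost + startup_cost + spill_cost + memory_cost


def optimize(space, predictor):
    """Invoke the predictor once per candidate and keep the cheapest one."""
    names = list(space)
    best_config, best_time = None, float("inf")
    for values in itertools.product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        estimate = predictor(config)
        if estimate < best_time:
            best_config, best_time = config, estimate
    return best_config, best_time


best_config, best_time = optimize(SEARCH_SPACE, predict_runtime)
```

Because the optimizer only ever calls `predictor(config)`, no real job has to run during the search. With dozens of parameters an exhaustive grid quickly becomes impractical, which is presumably why the abstract speaks of an approximately optimal configuration rather than an exact one.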
Subjects
tuning
optimization
predictor
Type
thesis
File(s)
Name
ntu-104-R02922142-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):0580c9825dc771e06e2661ff1629005a