Extending Resource Management System based on Heterogeneous Hadoop Yarn Platform
Date Issued
2015
Date
2015
Author(s)
Shen, Jian-Jang
Abstract
Apache Hadoop enables the distributed processing of large data sets across clusters of computers using simple programming models. Hadoop's rapid adoption has addressed many kinds of big data problems, and it is well suited to parallel processing. However, Hadoop applications are often criticized for poor performance caused by computational bottlenecks. This research proposes a framework that combines Hadoop YARN with GPUs, porting the Aparapi libraries into the YARN system to manage computing resources on heterogeneous platforms. We extend the Application Master, a core component of the YARN architecture, to act as the decision maker for resource requests based on our scheduling algorithm. In addition, we adopt a preemptive, locality-aware task scheduling mechanism to share CPU and GPU resources fairly. In the experiments, we show the overall speedup of an application and analyze the effects on performance.
Subjects
Hadoop
MapReduce
YARN
OpenCL
Aparapi
Type
thesis
