Extending Resource Management System based on Heterogeneous Hadoop Yarn Platform
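Apache Hadoop has flourished in recent years and is widely used in big data applications, but the performance bottleneck that the Hadoop framework encounters on CPUs is frequently criticized. Porting Hadoop applications to graphics processors (GPUs) to improve performance requires considerable extra effort from programmers. In this thesis, we use the Aparapi library to port Hadoop applications to run on GPUs, and we examine resource management in the Hadoop YARN framework on heterogeneous platforms.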
Apache Hadoop allows developers to process large data sets in a distributed manner across clusters of computers using simple programming models. Its rapid growth has addressed many kinds of big data problems, and it is well suited to parallel processing. However, Hadoop applications are often criticized for poor performance caused by computing bottlenecks. Our research proposes a framework that combines Hadoop YARN and GPUs, porting the Aparapi library into the YARN system to manage computing resources on heterogeneous platforms. We extend the Application Master, a core component of the YARN architecture, to act as the decision maker for resource requests based on our scheduling algorithm. In addition, we adopt a preemptive, locality-aware task scheduling mechanism to share CPU and GPU resources fairly. In the experiments, we show the overall speedup of an application and analyze the effects on performance.
Appears in Collections: 資訊工程學系 (Department of Computer Science and Information Engineering)