Title: Extending Resource Management System based on Heterogeneous Hadoop Yarn Platform
Original title: Hadoop 於異質平台之資源管理系統
Author: 沈建志 (Shen, Jian-Jang)
Advisor: 廖世偉
Affiliation: College of Electrical Engineering and Computer Science: Graduate Institute of Computer Science and Information Engineering
Year: 2015
Record dates: 2017-03-03, 2018-07-05
URL: http://ntur.lib.ntu.edu.tw//handle/246246/275471
Type: thesis
Thesis access rights: authorization not granted

Abstract (translated from the Chinese): Apache Hadoop has flourished in recent years and is widely used in big data applications, but the performance bottleneck that the Hadoop framework encounters on the CPU is frequently criticized. Porting a Hadoop application to a graphics processor to improve performance demands considerable extra effort from the programmer. In this thesis we use the Aparapi library to port Hadoop applications to run on graphics processors, and we examine how the Hadoop YARN framework manages resources on heterogeneous platforms.

Abstract: Apache Hadoop enables the distributed processing of large data sets across clusters of computers using simple programming models. Its rapid growth has addressed many kinds of big data problems, and it is well suited to parallel processing. However, Hadoop applications are often criticized for poor performance caused by computing bottlenecks. Our research proposes a framework that combines Hadoop YARN and the GPU, porting the Aparapi library into the YARN system to manage computing resources on heterogeneous platforms. We extend the Application Master, a core component of the YARN architecture, so that it decides resource requests based on our scheduling algorithm. In addition, we adopt a preemptive, locality-aware task scheduling mechanism to share CPU and GPU resources fairly. In the experiments, we show the overall speedup of an application and analyze the effects on performance.

Keywords: big data; heterogeneity; distributed system platform; graphics processing; Hadoop; MapReduce; Yarn; OpenCL; Aparapi