Title: Computation and Communication Schedule Optimization for Jobs with Shared Data (互享資料工作之計算與通訊排程最佳化)
Author: 周恩冉 (Chou, En-Jan)
Contributor: 劉邦鋒
Institution: National Taiwan University: Graduate Institute of Computer Science and Information Engineering
Record dates: 2007-11-26, 2018-07-05
Year: 2007
Type: thesis
Language: en-US
Format: application/pdf, 339238 bytes
Keywords: shared data, job scheduling
URI: http://ntur.lib.ntu.edu.tw//handle/246246/53763
Full text: http://ntur.lib.ntu.edu.tw/bitstream/246246/53763/1/ntu-96-R94922149-1.pdf

Abstract

Almost every computation job in a cluster or grid system requires input data to produce its result, and the computation cannot start until the required data become available. As a result, a proper interleaving of data transfer and job execution has a significant impact on overall efficiency. In this thesis we analyze the computational complexity of the shared data job scheduling problem, both with and without a storage capacity constraint. We show that if there is an upper bound on the server capacity, the problem is NP-complete, even when each job depends on at most three data items. On the other hand, if there is no upper bound on the server capacity, we show that there exists an efficient algorithm that gives an optimal job schedule when each job depends on at most two data items. We also give an efficient heuristic algorithm that produces good schedules when there is no limit on the number of data items a job may access. Experimental results indicate that this heuristic algorithm performs very well and gives near-optimal solutions.

Table of Contents

1 Introduction
2 System Model
2.1 Unlimited Capacity Model
2.2 Limited Capacity Model
2.3 Problem
3 Algorithms for Unlimited Capacity Model
3.1 Optimal Algorithm for Jobs with Two Data
4 Limited Capacity Model
5 Heuristic Algorithm
5.1 Minimum-Upload-Maximum-Execute
5.2 Longest Job First
5.3 Earliest Completion First
5.4 Random
6 Experimental Result
7 Conclusion