Computation and Communication Schedule Optimization for Jobs with Shared Data
Date Issued
2007
Date
2007
Author(s)
Chou, En-Jan
DOI
en-US
Abstract
Almost every computation job in the cluster or grid systems requires input data in order to find the solution, and the computation cannot proceed without the required data become available. As a result a proper interleaving of data transfer and job execution has a significant impact on the overall efficiency. In this paper we analyze the computational complexity of the shared data job scheduling problem, with and without consideration of storage capacity constraint. We show that if there is an upper bound on the server capacity, the problem is NP-complete, even when each job depends on at most three data. On the other hand, if there is no upper bound on the server capacity, we show that there exists an efficient algorithm that gives optimal job schedule when each job depends on at most two data. We also give an effective heuristic algorithm that gives good schedule for cases where there is no limit on the number of data a job may access. Experimental results indicate that this heuristic algorithm performs very well, and gives near optimal solutions.
Subjects
互享資料, 工作排程
shared data, job scheduling
Type
thesis
File(s)![Thumbnail Image]()
Loading...
Name
ntu-96-R94922149-1.pdf
Size
23.31 KB
Format
Adobe PDF
Checksum
(MD5):ffd4945c593213e79ed6db56b38aabd4