Resource Placement and Scheduling for Distributed Systems
Date Issued
2009
Date
2009
Author(s)
Lin, Yi-Fang
Abstract
Applying distributed systems is a typical solution for data-intensive applications, which must gather large computational power to handle enormous amounts of data. To enhance the overall performance of distributed systems, we need to address two important groups of problems concerning how to manage distributed resources. The first group is how to place resources at the proper locations in the network to achieve load balance, and the second is how to schedule requests for shared resources to reduce the overhead caused by requests that share the same resources.

In the first problem group, we investigate I/O server placement and data replica placement. Parallel I/O techniques can help relieve the serious performance bottleneck caused by I/O. However, switch-based clusters of workstations/PCs and distributed systems typically adopt general topologies to allow the construction of scalable systems with incremental expansion capability. These general topologies lack many of the attractive mathematical properties of regular topologies, which makes optimizing parallel I/O performance on general networks a difficult task. Therefore, we optimize server placement for parallel I/O in switch-based clusters to balance the workload among the I/O servers. In addition, data replication is a typical strategy for improving access performance and data availability in distributed systems with data-intensive applications (especially in Data Grids). Existing work usually focuses on the infrastructure for data replication and the mechanisms for replica creation and deletion, but the important problem of choosing suitable locations for placing replicas has not been fully studied. Thus, we also address the replica placement problem in Data Grids.

In the second problem group, we discuss parallel I/O scheduling and multicast scheduling. The lack of global information about I/O traffic between computing nodes and I/O servers imposes new challenges in optimizing parallel I/O for distributed systems.
Therefore, we develop two distributed algorithms for parallel I/O scheduling with non-uniform data sizes. Moreover, multicast is an important communication pattern, with applications in collective communication operations, and the bandwidth limitation of the links in the routing tree for general topologies makes multicast scheduling critical. Thus, we propose an agent-based multicast algorithm that guarantees contention-free multicast by exploiting the properties of the routing tree for general networks.

Major contributions of this dissertation are summarized as follows. First, in I/O server placement, we formulate the problem as a weighted bipartite matching with the goal of balancing the workload on the I/O servers, and we propose an efficient algorithm to find an optimal solution. To minimize link contention among the subclusters connected as a general topology, we devise a tree-based heuristic algorithm to assign servers among subclusters. Our simulation results demonstrate that our best algorithm is near-optimal in some cases. Second, in replica placement in a Data Grid, we propose a placement algorithm that finds optimal locations for replicas so that the workload among the replicas is balanced, and we also propose an algorithm that determines the minimum number of replicas when the maximum workload capacity of each replica is given. Third, in the parallel I/O scheduling problem, we propose distributed scheduling algorithms, and our experimental results indicate that our algorithms yield parallel performance within 6% of the centralized solutions. We also compare the performance of our algorithms with a distributed Highest Degree First method, which divides non-uniform data transfers into units of fixed-size blocks. The experimental results show that our algorithms require less scheduling and data transfer time.
Finally, in multicast scheduling for general networks, our experimental results demonstrate that our agent-based algorithm outperforms the most efficient algorithm reported in the existing literature.
Subjects
resource
placement
scheduling
distributed system
replica placement
I/O server
parallel I/O
multicast
I/O scheduling
up-down routing
Grid
Type
thesis
File(s)
Name
ntu-98-D92922009-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):d4bdfa34c96105e231da7c933a90682a
