Resource Placement and Scheduling for Distributed Systems

Lin, Yi-Fang

Resource Placement and Scheduling for Distributed Systems

Date Issued

2009

Date

2009

Author(s)

Lin, Yi-Fang

URI

http://ntur.lib.ntu.edu.tw//handle/246246/185353

Abstract

Applying distributed systems is a typical solution for data intensive applications to collect large computational power to handle the enormous data. To enhance overall performance of the distributed systems, we need to address two important groups of problems about how to manage the distributed resources. The first group is how to place the resources at the proper locations of the network to achieve load balance, and the second one is how to schedule the requests of the shared resources to reduce the overhead caused by the requests that share the same resources.n the first problem group, we investigate the I/O server placement and data replica placement. Parallel I/O techniques can help solve the serious bottleneck of performance caused by I/O. However, switch-based clusters of workstations/PCs and distributed systems typically adopteneral topologies to allow the construction of scalable systems with incremental expansion capability. These general topologies lack many of the attractive mathematical properties of regular topologies, whichakes optimizing parallel I/O performance on general networks a difficult task. Therefore, we optimize server placement for parallel I/O in switch-Based clusters to balance the workload among the I/O servers. In addition, data replication is a typical strategy for improving access performance and data availability in distributedystems with data intensive applications (especially in Data Grids). The existing works usually focus on the infrastructure for data replication and the mechanism of replicas creation and deletion, but the important problem of choosing suitable locations for placing replicas has not been fully studied. Thus, we also address replicalacement problem in Data Grids.n the second problem group, we discuss parallel I/O scheduling and multicast scheduling. The lack of global information about I/O traffic between computing nodes and I/O servers impose new challenges in optimizing parallel I/O for distributed systems. Therefore, we developwo distributed algorithms for parallel I/O scheduling with non-uniform data sizes. Moreover, multicast is an important communication pattern, with applications in collective communication operations, and theandwidth limitation of the links in the routing tree for general topologies make multicast scheduling critical. Thus, we propose an agent based multicast algorithm that guarantee contention free multicast by exploiting the properties of routing tree for general network.ajor contributions of this dissertation are summarized as follow. First, in I/O server placement, we formulate the problem as a weighted bipartite matching with the goal of balancing the workload on the I/O servers, and we propose an efficient algorithm to find an optimal solution. To minimize link contention among the subclusters connecteds a general topology, we devise a tree-based heuristic algorithm to assign servers among subclusters. Our simulation results demonstrate that our best algorithm is near-optimal in some cases. Second, in replica placement in a Data Grid, we propose a placement algorithm thatinds optimal locations for replicas so that the workload among the replicas is balanced, and we also propose an algorithm that determines the minimum number of replicas when the maximum workload capacity of each replica is given. Third, in parallel I/O scheduling problem, weropose distributed scheduling algorithms, and our experimental results indicate that our algorithms yield parallel performance within 6% of the centralized solutions. We also compare the performance of ourlgorithms with a distributed Highest Degree First method, which divides non-uniform data transfers into units of fixed-sized blocks. The experimental results show that our algorithms require less scheduling and data transfer time. Finally, in multicast scheduling for general networks, our experimental results demonstrate that our agent-based algorithm outperforms the most efficient algorithm reportedn existing literature.

Subjects

resource

placement

scheduling

distributed system

replica placement

I/O server

parallel I/O

multicast

I/O scheduling

up-down routing

Grid

Type

thesis

File(s)

Name

ntu-98-D92922009-1.pdf

Size

23.32 KB

Format

Adobe PDF

Checksum

(MD5):d4bdfa34c96105e231da7c933a90682a

Resource Placement and Scheduling for Distributed Systems

關於 (About)

聯絡資訊 (Contact Us)

相關網站 (Useful Links)

關於開放取用 (Open Access, OA)

出版社期刊論文授權政策 (Copyright)

使用說明 (Instructions)

登入說明 (Sign-in)

匯入著作 (Submission)