Improved Read Performance in CEFT-PVFS: Cost Effective, Fault-Tolerant Parallel Virtual File System


Yifeng Zhu, Hong Jiang, Xiao Qin, Dan Feng and David R. Swanson

Department of Computer Science and Engineering
University of Nebraska-Lincoln
Lincoln, NE 68588-0115, {yzhu, jiang, xqin, dswanson}@cse.unl.edu

Due to the ever-widening performance gap between processors and disks, I/O operations tend to become the major performance bottleneck of data-intensive applications on modern clusters. If the existing disks on the nodes of a cluster are connected together to form a high-performance parallel storage system, the cluster’s overall performance can be boosted at no additional cost. CEFT-PVFS, a RAID 10 style parallel file system that extends the original PVFS, is one such system: it divides the cluster nodes into two groups, stripes the data across one group in a round-robin fashion, and then duplicates the same data to the other group, providing storage service with both high performance and high reliability. Previous research has shown that mirroring improves system reliability by a factor of more than 40 while maintaining comparable write performance. This paper presents another benefit of CEFT-PVFS: the aggregate peak read performance can be improved by as much as 100% over that of the original PVFS by exploiting the increased parallelism. Additionally, when the data servers, which typically are also computational nodes in a cluster environment, are loaded in an unbalanced way by applications running in the cluster, the read performance of PVFS degrades significantly. In contrast, in CEFT-PVFS a heavily loaded data server can be skipped and all the desired data read from its mirroring node. Thus performance is not affected unless both the server node and its mirroring node are heavily loaded.
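
The round-robin striping and the skipping of a heavily loaded server described above can be illustrated with a minimal sketch. This is not the CEFT-PVFS implementation: the `Server` class, the normalized `load` metric, the `load_threshold` value, and both function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    name: str
    load: float  # hypothetical normalized load in [0, 1]; not a CEFT-PVFS metric


def stripe_index(offset: int, stripe_size: int, group_size: int) -> int:
    """Round-robin striping: index of the server in a group holding byte `offset`."""
    return (offset // stripe_size) % group_size


def pick_server(offset: int, stripe_size: int,
                primary: List[Server], mirror: List[Server],
                load_threshold: float = 0.8) -> Server:
    """Choose which copy to read from for the stripe containing `offset`.

    Assumes mirror[i] holds the same data as primary[i]. The threshold
    rule is an illustrative assumption, not the paper's policy.
    """
    i = stripe_index(offset, stripe_size, len(primary))
    p, m = primary[i], mirror[i]
    # Skip a heavily loaded server only if its mirror is not also heavily loaded.
    if p.load > load_threshold and m.load <= load_threshold:
        return m
    return p
```

With two servers per group and 64 KiB stripes, a read at offset 65536 maps to index 1 in each group; if the primary at that index is heavily loaded and its mirror is not, the sketch returns the mirror. When both copies are heavily loaded it falls back to the primary, matching the abstract's observation that performance suffers only in that case.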

In Proceedings of the IEEE/ACM CCGRID Workshop on Parallel I/O in Cluster Computing and Computational Grids, Japan, May 2003.