A Case Study of Parallel I/O for Biological Sequence Search on Linux Clusters
Hong Kong December 01-December 04
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/CLUSTR.2003.1253329
Fifth IEEE International Conference o ...
Yifeng Zhu, University of Nebraska-Lincoln
Hong Jiang, University of Nebraska-Lincoln
Xiao Qin, University of Nebraska-Lincoln
David Swanson, University of Nebraska-Lincoln

In this paper we analyze the I/O access patterns of a widely used biological sequence search tool and implement two variations that employ parallel I/O for data access, based on PVFS (Parallel Virtual File System) and CEFT-PVFS (Cost-Effective Fault-Tolerant PVFS). Experiments show that the two variations outperform the original tool even when they use the same number of storage devices, or fewer. We also find that although the performance of the two variations improves consistently as the number of servers is initially increased, the gain from parallel I/O becomes insignificant as the number of servers grows further.

We examine the effectiveness of two read performance optimization techniques in CEFT-PVFS, using this tool as a benchmark. Performance results indicate that (1) doubling the degree of parallelism boosts read performance to approach that of PVFS, and (2) skipping hot spots can substantially improve I/O performance when the load on the data servers is highly imbalanced. I/O resource contention, caused by multiple applications sharing server nodes in a cluster, is shown to degrade the performance of the original tool and of the PVFS-based variation by factors of up to 10 and 21, respectively, whereas the CEFT-PVFS-based variation suffers only a two-fold degradation.
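The two read optimizations can be pictured with a small sketch. It is an illustration only, not the authors' implementation: the abstract does not describe the data layout, so the sketch assumes that stripes are placed round-robin on a primary group of servers and duplicated on a mirror group of the same size. The function name `plan_reads`, the `hot` parameter, and the group labels are hypothetical.

```python
from typing import FrozenSet, List, Tuple

Assignment = Tuple[int, str, int]  # (stripe index, server group, server index)


def plan_reads(
    num_stripes: int,
    num_servers: int,
    hot: FrozenSet[Tuple[str, int]] = frozenset(),
) -> List[Assignment]:
    """Choose which copy of each stripe to read.

    Assumed layout (not taken from the paper): stripe i lives on server
    i % num_servers in both the primary group and the mirror group.
    """
    plan = []
    half = num_stripes // 2
    for stripe in range(num_stripes):
        server = stripe % num_servers
        # Doubled parallelism: the first half of the stripes is read from
        # the primary group and the second half from the mirror group, so
        # both groups serve the same read request.
        group = "primary" if stripe < half else "mirror"
        other = "mirror" if group == "primary" else "primary"
        # Hot-spot skipping: if the chosen server is marked as heavily
        # loaded and its counterpart is not, read the other copy instead.
        if (group, server) in hot and (other, server) not in hot:
            group = other
        plan.append((stripe, group, server))
    return plan


if __name__ == "__main__":
    for row in plan_reads(4, 2, hot=frozenset({("primary", 1)})):
        print(row)
```

In this sketch a read of four stripes over two servers per group touches all four servers when no server is marked hot, and a stripe whose preferred server is marked hot is redirected to its copy in the other group.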

Index Terms:
parallel I/O, CEFT-PVFS, PVFS, BLAST
Citation:
Yifeng Zhu, Hong Jiang, Xiao Qin, David Swanson, "A Case Study of Parallel I/O for Biological Sequence Search on Linux Clusters," cluster, pp.308, Fifth IEEE International Conference on Cluster Computing (CLUSTER'03), 2003