Abstract
The distribution of data in memory or on disks strongly influences the performance of parallel applications. At the same time, programs and systems such as parallel file systems frequently redistribute data between memory and disks.
This paper generalizes previous approaches to the redistribution problem. We introduce algorithms for mapping between two arbitrary distributions of a data set. The algorithms are optimized for multidimensional array partitions. We motivate our approach and present potential uses. The paper also presents a case study: the use of mapping functions and redistribution algorithms in a parallel file system.
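To make the idea of a mapping function concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a one-dimensional block-cyclic distribution, and the names `to_local`, `to_global`, and `redistribution_plan` are hypothetical. It maps each global index to an owner and a local offset, then derives, element by element, which source owner must send which offsets to which destination owner. An optimized algorithm would operate on whole contiguous runs instead of single elements.

```python
from collections import defaultdict


def to_local(i, block, nprocs):
    """Map global index i to (owner, local offset) under a 1-D
    block-cyclic distribution (assumed layout, for illustration)."""
    blk = i // block                       # global block number
    owner = blk % nprocs                   # blocks are dealt round-robin
    local = (blk // nprocs) * block + i % block
    return owner, local


def to_global(owner, local, block, nprocs):
    """Inverse of to_local: recover the global index."""
    local_blk = local // block             # block number within the owner
    return (local_blk * nprocs + owner) * block + local % block


def redistribution_plan(n, src, dst):
    """Naive element-wise plan for moving n elements from distribution
    src to distribution dst, each given as (block, nprocs).

    Returns {(src_owner, dst_owner): [(src_offset, dst_offset), ...]}.
    """
    plan = defaultdict(list)
    for i in range(n):
        s_owner, s_off = to_local(i, *src)
        d_owner, d_off = to_local(i, *dst)
        plan[(s_owner, d_owner)].append((s_off, d_off))
    return dict(plan)


if __name__ == "__main__":
    # 8 elements: blocks of 2 over 2 owners -> blocks of 4 over 2 owners.
    for pair, moves in sorted(redistribution_plan(8, (2, 2), (4, 2)).items()):
        print(pair, moves)
```

Each key of the returned plan identifies one source and destination pair, so the plan can be read as a list of messages to exchange, or, in a file system setting, as a list of transfers between memory buffers and disks.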
Cite this article
Isaila, F., Tichy, W.F. Mapping functions and data redistribution for parallel files. J Supercomput 46, 213–236 (2008). https://doi.org/10.1007/s11227-007-0165-x