Abstract
In message-passing environments, the main contributors to message delivery latency are the copy operations needed to transfer a received message and bind it to the consuming process or thread. To reduce this copying overhead, we introduce architectural extensions comprising a specialized network cache and associated instructions. In this work, we study the overhead and cache pollution introduced by the operating system and the communication stack, as exemplified by Linux, TCP/IP, and M-VIA. We incorporate this overhead into our simulation environment and study its effects on the proposed extensions. This allows us to compare the performance of an application running on a system that incorporates our extensions with the performance of the same application running on a standard system. The results show that the proposed approach can improve the performance of MPI applications by 10% to 20%.
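To make the copying overhead concrete, the following is a minimal illustrative sketch, not the simulation model used in the paper. It counts the bytes moved on two simplified receive paths: a socket-style path that stages the message in a kernel buffer before copying it to the application, and a user-level path, in the style of the Virtual Interface Architecture, that places the message directly in a user buffer. All function and variable names are hypothetical, and the model ignores DMA, protocol processing, and cache effects.

```python
# Illustrative model only: counts bytes moved on two hypothetical receive paths.
# None of these names come from the paper or from any real networking API.

def receive_with_kernel_copy(nic_buffer: bytes):
    """Model a socket-style path: NIC buffer -> kernel buffer -> user buffer."""
    moved = 0
    kernel_buffer = bytearray(nic_buffer)    # movement 1: into a kernel buffer
    moved += len(kernel_buffer)
    user_buffer = bytearray(kernel_buffer)   # movement 2: into the user buffer
    moved += len(user_buffer)
    return user_buffer, moved


def receive_user_level(nic_buffer: bytes):
    """Model a user-level path: the message lands directly in a user buffer."""
    user_buffer = bytearray(nic_buffer)      # single placement into user memory
    return user_buffer, len(user_buffer)


message = bytes(range(256)) * 16             # a 4096-byte payload
data_a, bytes_a = receive_with_kernel_copy(message)
data_b, bytes_b = receive_user_level(message)
print(bytes_a, bytes_b)
```

Under these assumptions the staged path moves twice as many bytes as the direct path for the same message, which is the kind of overhead that grows with message size.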
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Khunjush, F., Dimopoulos, N.J. (2007). Comparing Direct-to-Cache Transfer Policies to TCP/IP and M-VIA During Receive Operations in MPI Environments. In: Stojmenovic, I., Thulasiram, R.K., Yang, L.T., Jia, W., Guo, M., de Mello, R.F. (eds) Parallel and Distributed Processing and Applications. ISPA 2007. Lecture Notes in Computer Science, vol 4742. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74742-0_21
DOI: https://doi.org/10.1007/978-3-540-74742-0_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74741-3
Online ISBN: 978-3-540-74742-0
eBook Packages: Computer Science, Computer Science (R0)