Comparing Direct-to-Cache Transfer Policies to TCP/IP and M-VIA During Receive Operations in MPI Environments

  • Conference paper
Parallel and Distributed Processing and Applications (ISPA 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4742)

Abstract

The main contributors to message delivery latency in message-passing environments are the copying operations needed to transfer a received message and bind it to the consuming process or thread. To reduce this copying overhead, we introduce architectural extensions comprising a specialized network cache and instructions. In this work, we study the overhead and cache pollution introduced by the operating system and the communications stack, as exemplified by Linux, TCP/IP and M-VIA. We incorporate this overhead into our simulation environment and study its effects on our proposed extensions. This allows us to compare the performance of an application running on a system that incorporates our extensions with the performance of the same application running on a standard system. The results show that our proposed approach can improve the performance of MPI applications by 10% to 20%.

Editor information

Ivan Stojmenovic, Ruppa K. Thulasiram, Laurence T. Yang, Weijia Jia, Minyi Guo, Rodrigo Fernandes de Mello

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Khunjush, F., Dimopoulos, N.J. (2007). Comparing Direct-to-Cache Transfer Policies to TCP/IP and M-VIA During Receive Operations in MPI Environments. In: Stojmenovic, I., Thulasiram, R.K., Yang, L.T., Jia, W., Guo, M., de Mello, R.F. (eds) Parallel and Distributed Processing and Applications. ISPA 2007. Lecture Notes in Computer Science, vol 4742. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74742-0_21

  • DOI: https://doi.org/10.1007/978-3-540-74742-0_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-74741-3

  • Online ISBN: 978-3-540-74742-0

  • eBook Packages: Computer Science, Computer Science (R0)
