Abstract
Cenju-4 is a parallel computer designed and manufactured by NEC Corp. It supports two memory architectures: distributed memory with user-level message passing communication, and distributed shared memory with cache-coherent non-uniform memory access (cc-NUMA). A Cenju-4 system consists of 8 to 1024 nodes connected by a multistage network that provides multicast, synchronization, and gather functions. Each node has a MIPS R10000 processor with up to 512 Mbytes of main memory. This paper describes the architecture of Cenju-4, focusing on its multistage network and network interface, and presents performance results for message passing communication.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kanoh, Y., Nakamura, M., Hirose, T., Hosomi, T., Takayama, H., Nakata, T. (1999). Message Passing Communication in a parallel computer Cenju-4. In: Polychronopoulos, C., Fukuda, K.J.A., Tomita, S. (eds) High Performance Computing. ISHPC 1999. Lecture Notes in Computer Science, vol 1615. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0094911
DOI: https://doi.org/10.1007/BFb0094911
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65969-3
Online ISBN: 978-3-540-48821-7
eBook Packages: Springer Book Archive