The Experience in Designing and Evaluating the High Performance Cluster Netuno

Silva, Gabriel P.; Correa, Juliana; Bentes, Cristiana; Guedes, Sergio; Gabioux, Mariela

doi:10.1007/s10766-012-0224-7

Published: 07 October 2012

Volume 42, pages 265–286, (2014)

International Journal of Parallel Programming

Gabriel P. Silva¹,
Juliana Correa¹,
Cristiana Bentes²,
Sergio Guedes³ &
Mariela Gabioux⁴

Abstract

This paper presents the Netuno supercomputer, a large-scale cluster installed at the Federal University of Rio de Janeiro in Brazil. A detailed performance evaluation of Netuno is presented, depicting its computational and I/O performance, as well as the results for two real-world applications. Since building a high-performance cluster for running a wide range of applications is a non-trivial task, some lessons learned from assembling and operating this cluster, such as the excellent performance of the OpenMPI library and the relevance of employing an efficient parallel file system over the traditional NFS system, can be useful knowledge to support the design of new systems. Currently, Netuno is being heavily used to run large-scale simulations in the areas of ocean modeling, meteorology, engineering, physics, and geophysics.

Author information

Authors and Affiliations

DCC/IM-UFRJ, PO Box 68530, Rio de Janeiro, RJ, 21941-909, Brazil
Gabriel P. Silva & Juliana Correa
DESC-UERJ, R. S. Francisco Xavier 524, Rio de Janeiro, RJ, 20550-900, Brazil
Cristiana Bentes
NCE-UFRJ, PO Box 2324, Rio de Janeiro, RJ, 20010-974, Brazil
Sergio Guedes
COPPE-Oceano-UFRJ, PO Box 68508, Rio de Janeiro, RJ, 21945-970, Brazil
Mariela Gabioux

Corresponding author

Correspondence to Cristiana Bentes.

About this article

Cite this article

Silva, G.P., Correa, J., Bentes, C. et al. The Experience in Designing and Evaluating the High Performance Cluster Netuno. Int J Parallel Prog 42, 265–286 (2014). https://doi.org/10.1007/s10766-012-0224-7

Received: 30 November 2011
Accepted: 21 September 2012
Published: 07 October 2012
Issue Date: April 2014
DOI: https://doi.org/10.1007/s10766-012-0224-7
