Experience in Developing an Open Source Scalable Software Infrastructure in Japan

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6017)

Abstract

The Scalable Software Infrastructure for Scientific Computing (SSI) Project was initiated in November 2002 as a five-year national project in Japan. Its purpose was to construct a scalable software infrastructure to replace the existing implementations of parallel algorithms in individual scientific fields. The project covered four areas: iterative solvers for linear systems; fast integral transforms; their effective implementation on high performance computers of various types; and joint studies with institutes and computer vendors to evaluate the developed libraries in advanced computing environments.

An object-oriented programming model was adopted so that users can write parallel codes simply by combining elementary mathematical operations. The implemented algorithms were selected for their scalability on massively parallel computing environments. The libraries are freely available via the Internet and are intended to be improved through feedback from users. Since the first announcement in September 2005, the codes have been downloaded and evaluated by thousands of users at more than 140 organizations around the world.

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nishida, A. (2010). Experience in Developing an Open Source Scalable Software Infrastructure in Japan. In: Taniar, D., Gervasi, O., Murgante, B., Pardede, E., Apduhan, B.O. (eds) Computational Science and Its Applications – ICCSA 2010. ICCSA 2010. Lecture Notes in Computer Science, vol 6017. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12165-4_36

  • DOI: https://doi.org/10.1007/978-3-642-12165-4_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12164-7

  • Online ISBN: 978-3-642-12165-4

  • eBook Packages: Computer Science, Computer Science (R0)
