Study on Parallel Computing

  • Architecture
Journal of Computer Science and Technology

Abstract

In this paper, we present a general survey of parallel computing. It covers three main topics: parallel computer systems, which are the hardware platform of parallel computing; parallel algorithms, which are its theoretical basis; and parallel programming, which is its software support. We then introduce some parallel applications and enabling technologies. We argue that parallel computing research should form an integrated methodology of “architecture – algorithm – programming – application”. Only in this way can parallel computing research develop continuously and become more realistic.

Author information

Corresponding author

Correspondence to Guo-Liang Chen.

Additional information

Survey: Supported by the National Natural Science Foundation of China under Grant No.60533020.

Guo-Liang Chen is a professor and academician of the Chinese Academy of Sciences. He works in the Dept. Computer Sci. & Tech., University of Science and Technology of China. His major research areas include parallel computing theory and algorithms.

Guang-Zhong Sun is a lecturer in the Dept. Computer Sci. & Tech., University of Science and Technology of China (USTC). His research interests include parallel algorithms and scheduling theory.

Yun-Quan Zhang is an associate professor and vice director of the Lab. of Parallel Computing, Institute of Software, CAS. His research interests include performance evaluation, parallel software design, and parallel computational models.

Ze-Yao Mo is a professor. He conducts research on parallel algorithms and parallel application software for large-scale scientific and engineering numerical simulations.

About this article

Cite this article

Chen, GL., Sun, GZ., Zhang, YQ. et al. Study on Parallel Computing. J Comput Sci Technol 21, 665–673 (2006). https://doi.org/10.1007/s11390-006-0665-9

  • DOI: https://doi.org/10.1007/s11390-006-0665-9
