
Finding a suitable system scale to optimize program performance on software DSM systems

Cluster Computing

Abstract

Software distributed shared memory (DSM) systems have successfully provided an easy-to-use interface for parallel applications on distributed systems. To improve program performance, most DSM systems greedily use all available processors in a computer network to execute user programs. However, using more processors does not necessarily yield better program performance, because the overhead of parallelization grows with the number of processors used for program execution. If the performance gain from parallelism cannot compensate for this overhead, increasing the number of processors results in performance degradation and wasted resources. In this paper, we propose a mechanism that dynamically finds a suitable system scale to optimize the performance of DSM applications according to run-time information. The experimental results show that the proposed mechanism can precisely predict the number of processors that yields the best performance, and that it effectively optimizes the performance of the test applications by adapting the system scale to the predicted result.
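The trade-off described in the abstract can be illustrated with a minimal sketch. The cost model below is an assumption made purely for illustration and is not the prediction mechanism proposed in the paper: it supposes that computation divides evenly across processors and that each added processor contributes a fixed overhead. The names `predicted_time`, `best_processor_count`, `compute_time`, and `overhead_per_processor` are hypothetical.

```python
def predicted_time(p, compute_time, overhead_per_processor):
    """Hypothetical cost model (an assumption, not the paper's model):
    work divides evenly over p processors, and every processor beyond
    the first adds a fixed amount of parallelization overhead."""
    return compute_time / p + overhead_per_processor * (p - 1)


def best_processor_count(max_processors, compute_time, overhead_per_processor):
    """Return the processor count in 1..max_processors that minimizes
    the predicted execution time under the model above."""
    return min(
        range(1, max_processors + 1),
        key=lambda p: predicted_time(p, compute_time, overhead_per_processor),
    )


if __name__ == "__main__":
    # Illustrative numbers only: 100 time units of work, 4 units of
    # overhead per additional processor, 16 processors available.
    best = best_processor_count(16, 100.0, 4.0)
    print(best, predicted_time(best, 100.0, 4.0), predicted_time(16, 100.0, 4.0))
```

Under these illustrative numbers, the predicted time is lower at a moderate processor count than when all 16 processors are used, which is the situation the abstract warns about. A real system would replace the fixed parameters with measurements gathered at run time.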



Author information

Corresponding author

Correspondence to Laurence Tianruo Yang.

Additional information

Yi-Chang Zhuang received his B.S., M.S. and Ph.D. degrees in electrical engineering from National Cheng Kung University in 1995, 1997, and 2004. He is currently working as an engineer at the Industrial Technology Research Institute in Taiwan. His research interests include object-based storage, file systems, distributed systems, and grid computing.

Jyh-Biau Chang is currently an assistant professor in the Information Management Department of Leader University in Taiwan. He received his B.S., M.S. and Ph.D. degrees from the Electrical Engineering Department of National Cheng Kung University in 1994, 1996, and 2005. His research interests focus on cluster and grid computing, parallel and distributed systems, and operating systems.

Tyng-Yeu Liang is currently an assistant professor in the Department of Electrical Engineering, National Kaohsiung University of Applied Sciences in Taiwan. He received his B.S., M.S. and Ph.D. degrees from National Cheng Kung University in 1992, 1994, and 2000. His research interests include cluster and grid computing, image processing, and multimedia.

Ce-Kuen Shieh is currently a professor in the Electrical Engineering Department of National Cheng Kung University in Taiwan. He is also the chief of the computation center at National Cheng Kung University. He received his Ph.D. degree from the Department of Electrical Engineering of National Cheng Kung University in 1988. He was the chairman of the Electrical Engineering Department of National Cheng Kung University from 2002 to 2005. His research interests focus on computer networks and parallel and distributed systems.

Laurence T. Yang is a professor at the Department of Computer Science, St. Francis Xavier University, Canada. His research includes high performance computing and networking, embedded systems, ubiquitous/pervasive computing and intelligence, and autonomic and trusted computing.

About this article

Cite this article

Zhuang, YC., Chang, JB., Liang, TY. et al. Finding a suitable system scale to optimize program performance on software DSM systems. Cluster Comput 9, 223–236 (2006). https://doi.org/10.1007/s10586-006-9738-3


  • DOI: https://doi.org/10.1007/s10586-006-9738-3
