
An Analysis of the Cost Effectiveness of an Adaptable Computing Cluster

Cluster Computing

Abstract

Because they are built from commodity PC systems, Beowulf clusters traditionally lack the cutting-edge network architectures, memory subsystems, and processor technologies found in their more expensive supercomputer counterparts. What Beowulf clusters lack in technology, they more than make up for with their significant cost advantage over traditional supercomputers. This paper presents the cost implications of an architectural extension, called an intelligent network interface card (INIC), that adds reconfigurable computing to the network interface of Beowulf clusters. A quantitative description of cost-effectiveness is formulated to compare alternatives.

Cost-effectiveness is considered in the context of three applications: the 2D Fast Fourier Transform (2D-FFT), integer sorting, and probabilistic neural network (PNN) image classification. It is shown that, for these three representative applications, there is a range of basic hardware costs and cluster sizes for which the INIC is more efficient than a purely serial solution or an ordinary cluster. Furthermore, the cost model has proven useful for designing the next-generation INIC.
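The abstract does not state the cost model itself. As a rough illustration of how such a comparison can be set up, the sketch below uses the widely known speedup-versus-costup criterion of Wood and Hill, under which a configuration is cost-effective when its speedup over a baseline exceeds its cost relative to that baseline. This is an assumption standing in for the paper's own formulation, and every name and number here (cost_effectiveness, the node and INIC prices, the speedups) is hypothetical.

```python
def cost_effectiveness(speedup: float, base_cost: float, config_cost: float) -> float:
    """Return speedup divided by costup (config_cost / base_cost).

    Values above 1.0 mean the configuration delivers more performance per
    unit cost than the baseline; values below 1.0 mean it delivers less.
    """
    if base_cost <= 0 or config_cost <= 0:
        raise ValueError("costs must be positive")
    costup = config_cost / base_cost
    return speedup / costup


# Hypothetical inputs, chosen only to show the arithmetic.
serial_cost = 1000.0   # assumed cost of one serial machine
node_cost = 1000.0     # assumed cost of one cluster node
inic_cost = 500.0      # assumed added cost of one INIC per node
nodes = 8

# Ordinary cluster: assumed speedup of 5 on 8 nodes.
cluster = cost_effectiveness(5.0, serial_cost, nodes * node_cost)

# INIC cluster: assumed speedup of 12 on 8 nodes, each carrying an INIC.
inic = cost_effectiveness(12.0, serial_cost, nodes * (node_cost + inic_cost))

print(f"ordinary cluster: {cluster:.3f}")
print(f"INIC cluster:     {inic:.3f}")
```

With these made-up inputs, the INIC cluster costs 1.5 times as much as the ordinary cluster but breaks even against the serial machine, while the ordinary cluster does not. Sweeping node_cost, inic_cost, and nodes in a sketch like this is one way to look for the range of hardware costs and cluster sizes that the abstract describes.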


Cite this article

Underwood, K.D., Ligon III, W.B. & Sass, R.R. An Analysis of the Cost Effectiveness of an Adaptable Computing Cluster. Cluster Computing 7, 357–371 (2004). https://doi.org/10.1023/B:CLUS.0000039495.40522.de

  • DOI: https://doi.org/10.1023/B:CLUS.0000039495.40522.de
