Experiences with self-organizing, decentralized grids using the grid appliance

Cluster Computing

Abstract

“Give a man a fish, feed him for a day. Teach a man to fish, feed him for a lifetime.” (Lao Tzu)

Large-scale grid computing projects such as TeraGrid and Open Science Grid provide researchers with vast amounts of compute resources, but with requirements that can limit access, potentially long job queues that delay results, and environments and policies that may affect a user’s workflow. In many scenarios, particularly with the advent of Infrastructure-as-a-Service (IaaS) cloud computing, individual users and communities can benefit from less restrictive, dynamic systems that combine local resources with on-demand resources provisioned by one or more IaaS providers. These scenarios benefit from flexibility in deploying resources, remote access, and environment configuration.

In this paper, we address how small groups can dynamically create, join, and manage grid infrastructures with low administrative overhead. Our work distinguishes itself from other projects with similar objectives by combining decentralized system organization and user access for job submission with Web 2.0 interfaces for managing grid membership and automating certificate management. These components contribute to the design of the “Grid Appliance,” an implementation of a wide area overlay network of virtual workstations (WOW), which has developed over the past six years into a mature system with several deployments and many users. In addition to an architectural description, this paper presents lessons learned during the development and deployment of “Grid Appliance” systems and a case study backed by quantitative analysis that verifies the utility of our approach.

Acknowledgements

Our appreciation goes out to the Archer administrators at the various universities who have taken the time to understand the system and provide meaningful input to simplify it even further, in particular Perhaad Mistry at Northeastern University, Daniel Debertin at University of Minnesota at Minneapolis, Dimitris Kaseridis, formerly at University of Texas at Austin, Faisal Iqbal at University of Texas at Austin, Peter Gavin at Florida State University, and Pred Bundalo at Northwestern University. Many thanks are also due to Girish Venkatasubramanian for his significant contributions to the Archer project. This work is sponsored by the National Science Foundation (NSF) under awards 0751112 and 0721867. This material is based upon work supported in part by the NSF under grant 091812 (Future Grid). Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.

Author information

Corresponding author

Correspondence to David Isaac Wolinsky.

About this article

Cite this article

Wolinsky, D.I., Chuchaisri, P., Lee, K. et al. Experiences with self-organizing, decentralized grids using the grid appliance. Cluster Comput 16, 265–283 (2013). https://doi.org/10.1007/s10586-011-0195-2

  • DOI: https://doi.org/10.1007/s10586-011-0195-2
