Abstract
“Give a man a fish, feed him for a day. Teach a man to fish, feed him for a lifetime” (Lao Tzu).
Large-scale grid computing projects such as TeraGrid and Open Science Grid provide researchers with vast amounts of compute resources, but they come with requirements that can limit access, potentially long job queues that delay results, and environments and policies that might affect a user’s workflow. In many scenarios, and in particular with the advent of Infrastructure-as-a-Service (IaaS) cloud computing, individual users and communities can benefit from less restrictive, dynamic systems that combine local resources with on-demand resources provisioned by one or more IaaS providers. These scenarios benefit from flexibility in resource deployment, remote access, and environment configuration.
In this paper, we address how small groups can dynamically create, join, and manage grid infrastructures with low administrative overhead. Our work distinguishes itself from other projects with similar objectives by combining decentralized system organization and user access for job submission with Web 2.0 interfaces that manage grid membership and automate certificate management. These components contribute to the design of the “Grid Appliance,” an implementation of a wide area overlay network of virtual workstations (WOW), which has developed over the past six years into a mature system with several deployments and many users. In addition to an architectural description, this paper contains lessons learned during the development and deployment of “Grid Appliance” systems and a case study backed by quantitative analysis that verifies the utility of our approach.
Acknowledgements
Our appreciation goes out to the Archer administrators at the various universities who have taken the time to understand the system and provide meaningful input in order to simplify it even more, in particular, Perhaad Mistry at Northeastern University, Daniel Debertin at University of Minnesota at Minneapolis, Dimitris Kaseridis formerly at University of Texas at Austin, Faisal Iqbal at University of Texas at Austin, Peter Gavin at Florida State University, and Pred Bundalo at Northwestern University. Many thanks are also due to Girish Venkatasubramanian for his significant contributions to the Archer project. This work is sponsored by the National Science Foundation (NSF) under awards 0751112 and 0721867. This material is based upon work supported in part by the NSF under grant 091812 (Future Grid). Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the NSF.
Cite this article
Wolinsky, D.I., Chuchaisri, P., Lee, K. et al. Experiences with self-organizing, decentralized grids using the grid appliance. Cluster Comput 16, 265–283 (2013). https://doi.org/10.1007/s10586-011-0195-2
DOI: https://doi.org/10.1007/s10586-011-0195-2