Scalability analysis of three monitoring and information systems: MDS2, R-GMA, and Hawkeye

https://doi.org/10.1016/j.jpdc.2007.03.006

Abstract

Monitoring and information system (MIS) implementations provide data about available resources and services within a distributed system, or Grid. A comprehensive performance evaluation of an MIS can aid in detecting potential bottlenecks, advise in deployment, and help improve future system development. In this paper, we analyze and compare the performance of three implementations in a quantitative manner: the Globus Toolkit® Monitoring and Discovery Service (MDS2), the European DataGrid Relational Grid Monitoring Architecture (R-GMA), and the Condor project's Hawkeye. We use the NetLogger toolkit to instrument the main service components of each MIS and conduct four sets of experiments to benchmark their scalability with respect to the number of users, the number of resources, and the amount of data collected. Our study provides quantitative measurements comparable across all systems. We also find performance bottlenecks and identify how they relate to the design goals, underlying architectures, and implementation technologies of the corresponding MIS, and we present guidelines for deploying MISs in practice.

Xuehai Zhang received his B.S. degree in Computer Science from University of Science and Technology of China, Hefei in 1992. He received his M.S. degree in Computer Science from University of Chicago in 2002. He is currently a Ph.D. candidate in the Department of Computer Science at University of Chicago. His research interests are in parallel and Grid computing, with emphasis on Grid systems’ performance analysis and resource management on the virtual workspace, a virtualized Grid execution environment based on virtual machine technologies.

Jeffrey L. Freschl received his B.S. degree in Computer Science from the University of California, Santa Cruz in 2003. He received his M.S. degree in Computer Science from the University of Wisconsin, Madison in 2005. He is currently a Software Engineer with the DB2 Optimization group at IBM, Silicon Valley Lab. His research interests are in parallel and grid computing, with an emphasis on algorithm design and performance evaluation.

Jennifer M. Schopf is a Scientist at the Distributed Systems Lab, part of the Mathematics and Computer Science Division at Argonne National Lab, and is spending the year as a researcher at the National Science Center in Edinburgh, UK. She is a member of the Globus Alliance, and technology coordinator for the MDS. She received a B.A. in Computer Science and Mathematics from Vassar College, and M.S. and Ph.D. degrees from the University of California, San Diego in Computer Science and Engineering. Currently, her research interests include monitoring, performance prediction, and resource scheduling and selection.
