Comparing Logs and Models of Parallel Workloads Using the Co-plot Method

  • Conference paper
Job Scheduling Strategies for Parallel Processing (JSSPP 1999)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1659))

Abstract

We present a multivariate analysis technique called Co-plot that is especially suitable for samples with many variables and relatively few observations, as is often the case with workload data. Observations and variables are analyzed simultaneously. We find three stable clusters of highly correlated variables, whereas the workloads themselves differ considerably from one another. Synthetic models for workload generation are also analyzed and found to be reasonable; however, each model usually covers only one machine type well. This leads us to conclude that a parameterized model of parallel workloads should be built, and we describe guidelines for such a model. Another feature that the models lack is self-similarity: we demonstrate that production logs exhibit this phenomenon in several attributes of the workload, whereas none of the synthetic models do.
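
The abstract does not say how self-similarity was measured. One common estimator is the variance-time method: a series is averaged over non-overlapping blocks of size m, and the variance of the block means is expected to decay roughly as m^(2H-2), where H is the Hurst parameter. The sketch below is only an illustration under that assumption and is not the authors' procedure; the function names, the block sizes, and the use of white noise as input are all hypothetical choices. Uncorrelated noise should give H near 0.5, while a self-similar workload attribute would be expected to give a value noticeably above 0.5.

```python
import math
import random
import statistics


def aggregate(series, m):
    """Return the mean of each non-overlapping block of length m."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]


def hurst_variance_time(series, block_sizes):
    """Estimate the Hurst parameter with the variance-time method.

    Fits a least-squares line to log(variance of block means) versus
    log(block size); a slope of beta gives H = 1 + beta / 2.
    """
    xs, ys = [], []
    for m in block_sizes:
        agg = aggregate(series, m)
        if len(agg) < 2:
            continue  # too few blocks to compute a variance
        var = statistics.pvariance(agg)
        if var > 0:
            xs.append(math.log(m))
            ys.append(math.log(var))
    if len(xs) < 2:
        raise ValueError("need at least two usable block sizes")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return 1.0 + slope / 2.0


# Hypothetical input: white noise standing in for a workload attribute
# (for example, jobs submitted per time interval taken from a log).
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(20000)]
h_noise = hurst_variance_time(noise, [1, 2, 4, 8, 16, 32, 64])
print(f"Estimated H for white noise: {h_noise:.2f}")
```

To apply this to a real log, the `noise` list would be replaced by a measured series, and the largest block size should be kept small enough that many blocks remain at every aggregation level.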

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Talby, D., Feitelson, D.G., Raveh, A. (1999). Comparing Logs and Models of Parallel Workloads Using the Co-plot Method. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1999. Lecture Notes in Computer Science, vol 1659. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47954-6_3

  • DOI: https://doi.org/10.1007/3-540-47954-6_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66676-9

  • Online ISBN: 978-3-540-47954-3

  • eBook Packages: Springer Book Archive
