Abstract
We present a multivariate analysis technique called Co-plot that is especially suitable for samples with many variables and relatively few observations, as workload data often are. Co-plot analyzes observations and variables simultaneously. We find three stable clusters of highly correlated variables, whereas the workloads themselves are rather different from one another. Synthetic models for workload generation are also analyzed and found to be reasonable; however, each model usually covers one machine type well. This leads us to conclude that a parameterized model of parallel workloads should be built, and we describe guidelines for such a model. Another feature that the models lack is self-similarity: we demonstrate that production logs exhibit this phenomenon in several attributes of the workload, whereas none of the synthetic models does.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Talby, D., Feitelson, D.G., Raveh, A. (1999). Comparing Logs and Models of Parallel Workloads Using the Co-plot Method. In: Feitelson, D.G., Rudolph, L. (eds) Job Scheduling Strategies for Parallel Processing. JSSPP 1999. Lecture Notes in Computer Science, vol 1659. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47954-6_3
DOI: https://doi.org/10.1007/3-540-47954-6_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66676-9
Online ISBN: 978-3-540-47954-3
eBook Packages: Springer Book Archive