Abstract
The potential for tens to hundreds of processing elements on future single-chip multicore designs brings with it the potential to execute a wider variety of input streams, or workloads. At the same time, the trend is for a single user to utilize an entire single-chip multicore computer. A central challenge for these computers is how to model and identify persistent changes in the input stream, or workload modes. Computer architects often model single program phases as Markov chains. We define workload modes, then analyze and evaluate two modeling techniques: a Workload Classification Model (WCM) and a Hidden Markov Model (HMM). Experiments on a cell phone example illustrate that WCM is, on average, 34 times more time efficient and 83% more space efficient than HMM, while improving overall performance by an average of 191% and being, on average, 56% more energy efficient. We found that even sub-optimal use of WCM can outperform HMM, further supporting the need for design-time workload models. Our main contribution is to show that orienting the design of single-user multicore architectures to models of the workloads that arise from single-user usage patterns will be necessary as the complexity of applications and architectures grows. Thus, we advocate Workload Specific Processors as a new means of orienting single-user chip heterogeneous multiprocessors.
Cite this article
Otoom, M., Paul, J.M. Workload Mode Identification for Chip Heterogeneous Multiprocessors. Int J Parallel Prog 40, 184–224 (2012). https://doi.org/10.1007/s10766-011-0175-4
DOI: https://doi.org/10.1007/s10766-011-0175-4