Workload Mode Identification for Chip Heterogeneous Multiprocessors

International Journal of Parallel Programming

Abstract

The potential for tens to hundreds of processing elements on future single-chip multicore designs brings with it the potential to execute a wider variety of input streams, or workloads. At the same time, the trend is for a single user to utilize an entire single-chip multicore computer. A central challenge for these computers is how to model and identify persistent changes in the input stream, or workload modes. Computer architects often model the phases of a single program as Markov chains. We define workload modes, then analyze and evaluate two modeling techniques: a Workload Classification Model (WCM) and a Hidden Markov Model (HMM). Our experiments on a cell phone example illustrate that WCM is, on average, 34 times more time efficient and 83% more space efficient than HMM, while improving overall performance by an average of 191% and being, on average, 56% more energy efficient. We found that even sub-optimal use of WCM can outperform HMM, further supporting the need for design-time workload models. Our main contribution is to show that orienting the design of single-user multicore architectures toward models of the workloads that arise from single-user usage patterns will be necessary as the complexity of applications and architectures grows. Thus, we advocate Workload Specific Processors as a new means of orienting single-user chip heterogeneous multiprocessors.
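To make the idea of identifying a workload mode with an HMM more concrete, the following is a minimal sketch of HMM forward filtering in Python. It is not the model from the article: the two modes, the three applications, and every probability below are hypothetical values chosen only for illustration, and the article's WCM is not shown because its details are not given in the abstract.

```python
# Illustrative sketch only: mode names, applications, and all
# probabilities are invented for this example, not taken from the article.
MODES = ["idle", "media"]

# P(initial mode)
START = {"idle": 0.5, "media": 0.5}

# P(next mode | current mode); modes are persistent, so self-transitions dominate.
TRANS = {
    "idle": {"idle": 0.9, "media": 0.1},
    "media": {"idle": 0.1, "media": 0.9},
}

# P(observed application launch | mode)
EMIT = {
    "idle": {"sms": 0.8, "mp3": 0.1, "video": 0.1},
    "media": {"sms": 0.1, "mp3": 0.5, "video": 0.4},
}


def filter_modes(observations):
    """Return, per step, the posterior P(mode | observations so far)."""
    belief = dict(START)
    history = []
    for step, app in enumerate(observations):
        if step > 0:
            # Predict: propagate the belief through the transition model.
            belief = {
                m: sum(belief[p] * TRANS[p][m] for p in MODES) for m in MODES
            }
        # Update: weight each mode by how likely it is to emit this app.
        unnorm = {m: belief[m] * EMIT[m][app] for m in MODES}
        total = sum(unnorm.values())
        belief = {m: v / total for m, v in unnorm.items()}
        history.append(belief)
    return history


def most_likely_modes(observations):
    """Pick the most probable mode at each step of the input stream."""
    return [max(b, key=b.get) for b in filter_modes(observations)]


if __name__ == "__main__":
    stream = ["sms", "sms", "mp3", "video", "video"]
    for app, mode in zip(stream, most_likely_modes(stream)):
        print(app, "->", mode)
```

Because the assumed transition probabilities favor staying in the current mode, a single out-of-character application launch does not flip the estimate; the belief shifts only after the input stream changes persistently, which is the sense of "mode" used in the abstract.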



Author information

Corresponding author

Correspondence to Mwaffaq Otoom.

About this article

Cite this article

Otoom, M., Paul, J.M. Workload Mode Identification for Chip Heterogeneous Multiprocessors. Int J Parallel Prog 40, 184–224 (2012). https://doi.org/10.1007/s10766-011-0175-4

  • DOI: https://doi.org/10.1007/s10766-011-0175-4
