HASS: a scheduler for heterogeneous multicore systems

Published: 21 April 2009
Abstract

Future heterogeneous single-ISA multicore processors will have an edge in potential performance per watt over comparable homogeneous processors. To fully tap into that potential, the OS scheduler needs to be heterogeneity-aware, so that it can match jobs to cores according to the characteristics of both. We propose a Heterogeneity-Aware Signature-Supported (HASS) scheduling algorithm that does the matching using per-thread architectural signatures, which are compact summaries of threads' architectural properties collected offline. The resulting algorithm does not rely on dynamic profiling and is comparatively simple and scalable. We implemented HASS in OpenSolaris and achieved average workload speedups of up to 13%, matching the best static assignment, which is achievable only by an oracle. We also implemented a previously proposed dynamic IPC-driven algorithm that relies on online profiling. We found that the complexity, load imbalance, and associated performance degradation resulting from dynamic profiling are significant challenges to using this algorithm successfully. As a result, it failed to deliver the expected performance gains and did not outperform HASS.
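
The abstract does not say what a signature contains or how the scheduler consumes it, so the following Python sketch is only an illustration of the general idea of matching threads to cores from offline data. It assumes, hypothetically, that each signature reduces to an estimated run time per core type and that fast cores go to the threads with the largest estimated slow-to-fast ratio. The names `Signature`, `est_time`, `assign`, and the `"fast"`/`"slow"` core types are invented for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Signature:
    """Hypothetical offline signature: estimated run time (seconds) per core type."""
    thread: str
    est_time: dict  # core type name -> estimated run time


def assign(signatures, fast_slots):
    """Place the threads that benefit most from a fast core on the fast cores.

    No online profiling is involved: the decision uses only the
    precomputed signatures (an assumption made for this sketch).
    """
    def benefit(sig):
        # Ratio > 1 means the thread is estimated to run faster on a fast core.
        return sig.est_time["slow"] / sig.est_time["fast"]

    ranked = sorted(signatures, key=benefit, reverse=True)
    return {
        sig.thread: ("fast" if rank < fast_slots else "slow")
        for rank, sig in enumerate(ranked)
    }


# Made-up numbers, for illustration only.
SIGS = [
    Signature("compute_bound", {"fast": 10.0, "slow": 25.0}),
    Signature("memory_bound", {"fast": 18.0, "slow": 20.0}),
    Signature("mixed", {"fast": 12.0, "slow": 18.0}),
]

placement = assign(SIGS, fast_slots=1)
```

With one fast core available, the sketch gives it to `compute_bound`, whose estimated ratio (2.5) is the largest, and leaves the other two threads on slow cores. A greedy ranking like this is one simple static policy; it is not claimed to be the assignment procedure HASS uses.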

