DOI: 10.1145/2016604.2016647

FACT: a framework for adaptive contention-aware thread migrations

Published: 03 May 2011

Abstract

Thread scheduling in multi-core systems is a challenging problem because cores on a single chip usually share parts of the memory hierarchy, such as last-level caches, prefetchers and memory controllers, so threads running on different cores interfere with each other while competing for these resources. Data center service providers are interested in compressing the workload onto as few computing units as possible so as to utilize their resources most efficiently and conserve power. However, because memory hierarchy interference between threads is not managed by commercial operating systems, data center operators still prefer running threads on different chips so as to avoid possible performance degradation due to interference.
In this work, we improved the system's throughput by minimizing inter-workload contention for memory hierarchy resources. We achieved this by implementing FACT, a Framework for Adaptive Contention-aware Thread migrations, which measures the relevant performance monitoring events online, learns to predict the effects of interference on the performance of workloads, and then makes optimal thread scheduling decisions. We found that when instantiated with a fuzzy rule-based (FRB) predictive model, FACT achieves on average a 74% prediction accuracy on new data. In experiments conducted on a quad-core machine running OpenSolaris, SPEC CPU2006 workloads under FACT-FRB ran up to 11.6% faster than under the default OpenSolaris scheduler. FACT-FRB was also able to find the best combination of workloads more consistently than the state-of-the-art algorithms that aim to minimize contention for memory resources on each chip. Unlike these algorithms, which are based on fixed heuristics, FACT can be easily adapted to consider other performance factors so as to accommodate changes in architectural features and performance bottlenecks in future systems.
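The predict-then-schedule idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes four workloads placed in pairs, with each pair sharing one cache domain (the abstract does not state the machine's topology), and it replaces the learned FRB model with a hypothetical `toy_predictor` driven by made-up miss rates. The names `pairings`, `best_placement`, `MISS_RATE` and `toy_predictor` are all illustrative.

```python
def pairings(items):
    """Yield every way to split an even-sized list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def best_placement(workloads, predict_slowdown):
    """Return the pairing with the lowest total predicted slowdown."""
    return min(
        pairings(list(workloads)),
        key=lambda pairs: sum(predict_slowdown(a, b) for a, b in pairs),
    )


# Hypothetical per-workload miss rates (assumed values, not measured data).
MISS_RATE = {"A": 30.0, "B": 25.0, "C": 2.0, "D": 1.0}


def toy_predictor(a, b):
    # Stand-in for a learned model: predicted interference grows with the
    # product of the two co-runners' miss rates (an assumption).
    return MISS_RATE[a] * MISS_RATE[b]


best = best_placement(MISS_RATE, toy_predictor)
print(best)  # [('A', 'D'), ('B', 'C')]
```

With these assumed numbers, the sketch separates the two memory-intensive workloads (A and B) and pairs each with a light one, which is the kind of placement a contention-aware scheduler would aim for. Swapping in a different predictor function changes the decision without touching the search.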

Published In

CF '11: Proceedings of the 8th ACM International Conference on Computing Frontiers
May 2011
268 pages
ISBN: 9781450306980
DOI: 10.1145/2016604

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. multicore
  2. operating systems
  3. scheduling
  4. supervised learning

Qualifiers

  • Research-article

Conference

CF'11
CF'11: Computing Frontiers Conference
May 3 - 5, 2011
Ischia, Italy

Acceptance Rates

Overall Acceptance Rate 273 of 785 submissions, 35%

Cited By

  • (2024) Suppressing the Interference Within a Datacenter: Theorems, Metric and Strategy. IEEE Transactions on Parallel and Distributed Systems 35:5 (732-750). DOI: 10.1109/TPDS.2024.3354418. Online publication date: May-2024
  • (2023) Ah-Q: Quantifying and Handling the Interference within a Datacenter from a System Perspective. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (471-484). DOI: 10.1109/HPCA56546.2023.10071128. Online publication date: Feb-2023
  • (2020) Work In Progress: Control-Flow Migration for Data-Locality Optimisation in Multi-Core Real-Time Systems. 2020 IEEE Real-Time Systems Symposium (RTSS) (371-374). DOI: 10.1109/RTSS49844.2020.00041. Online publication date: Dec-2020
  • (2019) Energy-Efficient Thread Mapping for Heterogeneous Many-Core Systems via Dynamically Adjusting the Thread Count. Energies 12:7 (1346). DOI: 10.3390/en12071346. Online publication date: 8-Apr-2019
  • (2017) A Machine Learning Approach to Automatic Creation of Architecture-Sensitive Performance Heuristics. 2017 IEEE 19th International Conference on High Performance Computing and Communications; IEEE 15th International Conference on Smart City; IEEE 3rd International Conference on Data Science and Systems (HPCC/SmartCity/DSS) (18-25). DOI: 10.1109/HPCC-SmartCity-DSS.2017.3. Online publication date: Dec-2017
  • (2016) The Linux scheduler. Proceedings of the Eleventh European Conference on Computer Systems (1-16). DOI: 10.1145/2901318.2901326. Online publication date: 18-Apr-2016
  • (2016) EFS. Journal of Parallel and Distributed Computing 95:C (3-14). DOI: 10.1016/j.jpdc.2016.03.007. Online publication date: 1-Sep-2016
  • (2015) Realizing energy-efficient thread affinity configurations with supervised learning. Proceedings of the 2015 Sixth International Green and Sustainable Computing Conference (IGSC) (1-4). DOI: 10.1109/IGCC.2015.7393691. Online publication date: 14-Dec-2015
  • (2015) Thread Count Prediction Model. Proceedings of the 2015 IEEE 21st International Conference on Parallel and Distributed Systems (ICPADS) (456-464). DOI: 10.1109/ICPADS.2015.64. Online publication date: 14-Dec-2015
  • (2015) Change Detection Based Parallelism Mapping: Exploiting Offline Models and Online Adaptation. Languages and Compilers for Parallel Computing (208-223). DOI: 10.1007/978-3-319-17473-0_14. Online publication date: 1-May-2015
