Process scheduling for future multicore processors

Authors:
Thomas Canhao Xu, Turku Center for Computer Science (TUCS), Turku, Finland
Pasi Liljeberg, University of Turku, Turku, Finland
Hannu Tenhunen, University of Turku, Turku, Finland

INA-OCMC '11: Proceedings of the Fifth International Workshop on Interconnection Network Architecture: On-Chip, Multi-Chip, January 2011, Pages 15–18
https://doi.org/10.1145/1930037.1930042

Published: 23 January 2011

ABSTRACT

In this paper, we study and analyze process scheduling problems for future multicore processors. It is expected that hundreds or even thousands of cores will be integrated on a single chip, known as a Chip Multiprocessor (CMP). However, operating system process scheduling, one of the most important design issues for CMP systems, has not been well addressed. We define a model for future CMPs and, based on it, propose a scheduling algorithm that reduces on-chip communication latencies and improves performance. The impact of memory access and inter-process communication (IPC) on scheduling is analyzed. We explore six typical core allocation strategies. The results show that a strategy that balances IPC and memory access outperforms the other strategies: the two metrics, misses per thousand instructions and cache hit latency, are reduced by up to 25.97% and 13.11%, respectively.
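The idea of weighing IPC against memory access when choosing cores can be illustrated with a minimal sketch. This is not the algorithm from the paper, whose model is not given in the abstract. It assumes a 2D mesh in which cores and memory controllers are identified by (row, column) coordinates and latency is approximated by Manhattan hop count. The function names, the `weight` parameter, and the candidate regions are all hypothetical.

```python
from itertools import combinations

# Illustrative only: hop count on an assumed 2D mesh stands in for latency.
def manhattan(a, b):
    """Hop distance between two (row, col) positions on a mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def ipc_cost(cores):
    """Average hop distance between every pair of allocated cores."""
    pairs = list(combinations(cores, 2))
    if not pairs:
        return 0.0
    return sum(manhattan(a, b) for a, b in pairs) / len(pairs)

def memory_cost(cores, controllers):
    """Average hop distance from each core to its nearest memory controller."""
    return sum(min(manhattan(c, m) for m in controllers) for c in cores) / len(cores)

def balanced_cost(cores, controllers, weight=0.5):
    """Weighted sum of the two costs; weight=0.5 treats them equally (an assumption)."""
    return weight * ipc_cost(cores) + (1 - weight) * memory_cost(cores, controllers)

def best_allocation(candidates, controllers, weight=0.5):
    """Pick the candidate core set with the lowest balanced cost."""
    return min(candidates, key=lambda cores: balanced_cost(cores, controllers, weight))

# Hypothetical example: two candidate regions for a four-thread process,
# with a single memory controller assumed at the mesh corner (0, 0).
row = [(0, 0), (0, 1), (0, 2), (0, 3)]
square = [(0, 0), (0, 1), (1, 0), (1, 1)]
chosen = best_allocation([row, square], controllers=[(0, 0)])
```

In this toy setup the compact square region is chosen over the row, because it is closer on average both between its own cores and to the controller. Setting `weight` to 1.0 or 0.0 would consider only IPC or only memory access, respectively.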


Index Terms

1. Computer systems organization
  1. Architectures
    1. Parallel architectures
2. Hardware
  1. Integrated circuits
    1. Interconnect


Published in
INA-OCMC '11: Proceedings of the Fifth International Workshop on Interconnection Network Architecture: On-Chip, Multi-Chip
January 2011
44 pages
ISBN: 9781450302722
DOI: 10.1145/1930037
General Chairs: José Flich (Techn. University of Valencia, Spain), Davide Bertozzi (University of Ferrara, Italy)
Program Chairs: Tor Skeie (Simula Research Labs, Norway), Daniele Ludovici (TUDelft, The Netherlands and University of Ferrara, Italy)
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
chip multiprocessor
multicore
multiprocessor
network-on-chip
scheduling
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 12 of 27 submissions, 44%