Fine grain thread scheduling on multicore processors: cores with multiple functional units

Authors:
Munish Bhatia (Birla Institute of Technology and Science Pilani, Pilani, India)
D. C. Kiran (Birla Institute of Technology and Science Pilani, Pilani, India)
J. P. Misra (Birla Institute of Technology and Science Pilani, Pilani, India)
S. Gurunarayanan (Birla Institute of Technology and Science Pilani, Pilani, India)

Compute '13: Proceedings of the 6th ACM India Computing Convention, August 2013, Article No.: 20, Pages 1–6
https://doi.org/10.1145/2522548.2523137

Published: 22 August 2013

ABSTRACT

This work presents a global scheduling technique for multicore processors, with a specific focus on cores that have multiple functional units. The design philosophy of the multicore architecture is to fit more cores, with more execution capability, on a chip by removing other complex and redundant circuitry. Because the on-chip hardware of a multicore processor is simple, the onus of detecting and exploiting instruction-level parallelism (ILP) in the program lies with the compiler. The proposed technique schedules instructions onto multiple on-chip cores, each having multiple functional units. It does so by dividing each basic block of the program's control flow graph (CFG) into subdivisions called sub-blocks. Each sub-block is analyzed for its mix of instruction types (integer or floating point), and the sub-blocks are then scheduled onto different cores while balancing that mix against the cost of communication among the cores. The scheduler gives each core a sufficient, or approximately equal, number of integer and floating-point instructions, which can execute in parallel on the core's integer and floating-point units, thus taking advantage of the core's architecture.
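The idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it is a minimal greedy heuristic written under stated assumptions. The `SubBlock` type, the `schedule` function, the `comm_cost` parameter, and the cost model (a core's busy time is taken as the larger of its integer and floating-point instruction counts, because the two units are assumed to run in parallel) are all hypothetical and introduced here only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SubBlock:
    """Hypothetical summary of one sub-block of a basic block."""
    name: str
    int_ops: int                             # number of integer instructions
    fp_ops: int                              # number of floating-point instructions
    deps: set = field(default_factory=set)   # names of sub-blocks it depends on


def schedule(sub_blocks, num_cores, comm_cost=2):
    """Greedily place sub-blocks on cores.

    Assumptions (not from the paper): sub_blocks is listed in dependence
    order, each core has one integer and one floating-point unit that run
    in parallel, and every dependence on a sub-block placed on another
    core costs a fixed comm_cost.
    """
    loads = [[0, 0] for _ in range(num_cores)]  # [int_ops, fp_ops] per core
    placement = {}
    for sb in sub_blocks:
        best_core, best_score = None, None
        for core in range(num_cores):
            ints = loads[core][0] + sb.int_ops
            fps = loads[core][1] + sb.fp_ops
            # Busy time estimate: the slower of the two units dominates,
            # so an even integer/floating-point mix keeps this low.
            busy = max(ints, fps)
            remote = sum(1 for d in sb.deps if placement[d] != core)
            score = busy + comm_cost * remote
            if best_score is None or score < best_score:
                best_core, best_score = core, score
        placement[sb.name] = best_core
        loads[best_core][0] += sb.int_ops
        loads[best_core][1] += sb.fp_ops
    return placement, loads


# Illustrative input: four sub-blocks with made-up instruction counts.
blocks = [
    SubBlock("A", int_ops=4, fp_ops=0),
    SubBlock("B", int_ops=0, fp_ops=4),
    SubBlock("C", int_ops=3, fp_ops=1, deps={"A"}),
    SubBlock("D", int_ops=1, fp_ops=3, deps={"B"}),
]
placement, loads = schedule(blocks, num_cores=2)
print(placement)
print(loads)
```

With this input the integer-only sub-block A and the floating-point-only sub-block B share core 0, so both of its units stay busy, while C and D go to core 1 despite the communication penalty. Each core ends up with four integer and four floating-point instructions.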

Index Terms

1. Software and its engineering
  1. Software notations and tools
    1. Compilers
      1. Runtime environments
      2. Source code generation

Published in
Compute '13: Proceedings of the 6th ACM India Computing Convention
August 2013
196 pages
ISBN: 9781450325455
DOI: 10.1145/2522548
General Chairs: R. K. Shyamasundar (TIFR, Mumbai), Lokendra Shastri (Infosys Labs, Infosys Ltd)
Program Chairs: D Janakiram (IIT Chennai), Srinivas Padmanabhuni (Infosys Labs)
Copyright © 2013 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
control flow graph
instruction level parallelism
multicore
static single assignment (SSA)
Qualifiers
- research-article
Acceptance Rates
Compute '13 Paper Acceptance Rate: 24 of 96 submissions, 25%
Overall Acceptance Rate: 114 of 622 submissions, 18%