ABSTRACT
This work presents a global scheduling technique for multicore processors, with a specific focus on cores that have multiple functional units. The design philosophy of the multicore architecture is to fit more cores, each with greater execution capability, on a chip by removing other complex and redundant circuitry. Because the on-chip hardware of a multicore processor is kept simple, the onus of detecting and exploiting the instruction-level parallelism (ILP) in a program lies on the compiler. The proposed technique schedules instructions onto multiple on-chip cores, each of which has multiple functional units. It does so by dissecting each basic block of the program's control flow graph (CFG) into subdivisions called sub-blocks. Each sub-block is analyzed for its mix of instructions by type (integer or floating point), and the sub-blocks are then scheduled onto different cores while trying to balance this distribution against the cost of communication among the cores. The scheduler gives each core a sufficient, or approximately equal, number of integer and floating-point instructions, which can execute in parallel on the core's multiple functional units (the integer unit and the floating-point units), thus taking advantage of the core's architecture.
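The balancing idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a simple greedy heuristic written under several assumptions. The names `SubBlock`, `schedule`, and `comm_cost`, the scoring formula, and the assumption that sub-blocks arrive in dependence order are all hypothetical and chosen only for illustration. Each sub-block is placed on the core where the resulting integer/floating-point mix is most even, the total load is lowest, and the fewest producer sub-blocks sit on a different core.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubBlock:
    """Hypothetical summary of one sub-block of a basic block."""
    name: str
    int_ops: int                    # number of integer instructions
    fp_ops: int                     # number of floating-point instructions
    deps: frozenset = frozenset()   # names of sub-blocks whose results are consumed


def schedule(sub_blocks, num_cores, comm_cost=1.0):
    """Greedily assign sub-blocks to cores (illustrative heuristic only).

    Assumes sub_blocks is already in dependence (topological) order.
    The score of a candidate core is the sum of:
      - the integer/FP imbalance the core would have after placement,
      - the total load the core would have after placement,
      - comm_cost for every producer sub-block placed on another core.
    """
    int_load = [0] * num_cores
    fp_load = [0] * num_cores
    placement = {}

    for sb in sub_blocks:
        best_core, best_score = None, None
        for core in range(num_cores):
            new_int = int_load[core] + sb.int_ops
            new_fp = fp_load[core] + sb.fp_ops
            imbalance = abs(new_int - new_fp)
            load = new_int + new_fp
            # Count producers that live on a different core.
            remote = sum(1 for d in sb.deps if placement.get(d, core) != core)
            score = imbalance + load + comm_cost * remote
            if best_score is None or score < best_score:
                best_core, best_score = core, score
        placement[sb.name] = best_core
        int_load[best_core] += sb.int_ops
        fp_load[best_core] += sb.fp_ops

    return placement, int_load, fp_load


if __name__ == "__main__":
    blocks = [
        SubBlock("A", int_ops=4, fp_ops=0),
        SubBlock("B", int_ops=0, fp_ops=4),
        SubBlock("C", int_ops=4, fp_ops=0),
        SubBlock("D", int_ops=0, fp_ops=4),
    ]
    placement, int_load, fp_load = schedule(blocks, num_cores=2)
    print(placement, int_load, fp_load)
```

In this toy input, each core ends up with an equal number of integer and floating-point instructions, so both kinds of functional unit on each core have work available. Raising `comm_cost` makes the heuristic prefer keeping dependent sub-blocks on the same core, at the expense of a less even instruction mix.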
Index Terms
- Fine grain thread scheduling on multicore processors: cores with multiple functional units