Abstract
Effective utilization of the processing resources available in current multi- and manycore systems depends primarily on the skill of the application programmer. This chapter analyses opportunities and suggests approaches for making proper use of parallel resources through holistic, cross-layer and interdisciplinary optimization of application, middleware and architecture aspects. Using heterogeneous network processors as an example, we show how application-specific architecture optimizations from this processor domain can be adapted to benefit the design of homogeneous general-purpose manycore systems. In addition, methods that have been applied successfully in HPC and scientific computing over the past decades are assessed and scaled down for use in manycores. Finally, we show how bio-inspired principles (i.e., self-organization and self-adaptation) offer rich opportunities for adoption in both application-specific and general-purpose manycores, for example to self-optimize processor parameters and workload utilization. In summary, we present a set of suggestions for architectural improvements and building blocks that, from our perspective, will help future manycores better support the exploitation of available parallel processing resources.
Acknowledgements
Particular thanks go to the German Research Foundation (DFG), the State of Bavaria and Infineon Technologies for supporting our work as part of the Priority Programmes “1148: Reconfigurable Computing” and “1183: Organic Computing”, the “Munich Centre for Advanced Computing” (Project B4, MAPCO) and the BMBF Collaborative industry project “RapidMPSoC” (grant BMBF 01M3085).
Copyright information
© 2011 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Herkersdorf, A. et al. (2011). Hardware Support for Efficient Resource Utilization in Manycore Processor Systems. In: Hübner, M., Becker, J. (eds) Multiprocessor System-on-Chip. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-6460-1_3
DOI: https://doi.org/10.1007/978-1-4419-6460-1_3
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-6459-5
Online ISBN: 978-1-4419-6460-1
eBook Packages: Engineering, Engineering (R0)