Dynamic Code Partitioning for Clustered Architectures

Canal, Ramon; Parcerisa, Joan-Manuel; González, Antonio

doi:10.1023/A:1026483904675

Published: February 2001

Volume 29, pages 59–79, (2001)

International Journal of Parallel Programming

Abstract

Recent work¹ shows that delays introduced in the issue and bypass logic will become critical for wide-issue superscalar processors. One proposed solution is to cluster the processor core. Clustered architectures benefit from a less complex, partitioned processor core and thus incur less critical delays. In this paper, we propose dynamic instruction steering logic for these clustered architectures that decides at decode time the cluster in which each instruction is executed. The performance of clustered architectures depends on the inter-cluster communication overhead and the workload balance. We present a scheme that uses run-time information to optimize the trade-off between these two factors. The evaluation shows that this scheme can achieve an average speed-up of 35% over a conventional 8-way issue (4 int + 4 fp) machine and that it outperforms previous proposals, both static and dynamic.
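The trade-off described in the abstract can be illustrated with a small sketch. This is not the steering scheme evaluated in the paper, whose details are not given here; it is a generic heuristic written under stated assumptions. The names `steer`, `copies_needed`, `reg_home`, `load`, and `imbalance_threshold` are all hypothetical: the instruction goes to the cluster that already holds most of its source operands (fewer inter-cluster copies), unless that cluster is much busier than the least-loaded one (workload balance).

```python
from collections import Counter


def steer(src_regs, reg_home, load, imbalance_threshold=4):
    """Pick a cluster for one instruction at decode time (illustrative only).

    src_regs: source register names of the instruction
    reg_home: hypothetical map from register name to the cluster holding its value
    load: number of pending instructions per cluster
    imbalance_threshold: assumed tuning knob, not a value from the paper
    """
    least_loaded = min(range(len(load)), key=lambda c: load[c])

    # Count how many source operands already live in each cluster.
    votes = Counter(reg_home[r] for r in src_regs if r in reg_home)
    if not votes:
        # No operand locality to exploit: just balance the workload.
        return least_loaded

    # Prefer the cluster with most operands; break ties by lower load.
    preferred = max(votes, key=lambda c: (votes[c], -load[c]))

    # Give up locality when the preferred cluster is too far ahead in load.
    if load[preferred] - load[least_loaded] > imbalance_threshold:
        return least_loaded
    return preferred


def copies_needed(src_regs, reg_home, cluster):
    """Number of source operands that must be copied into `cluster`."""
    return sum(1 for r in src_regs if reg_home.get(r, cluster) != cluster)


if __name__ == "__main__":
    reg_home = {"r1": 0, "r2": 0, "r3": 1}
    print(steer(["r1", "r2"], reg_home, load=[2, 1]))   # operands favor cluster 0
    print(steer(["r1", "r2"], reg_home, load=[10, 1]))  # imbalance overrides locality
```

Raising `imbalance_threshold` in this sketch favors fewer inter-cluster copies at the cost of a less even workload; lowering it does the opposite.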

REFERENCES

S. Palacharla, N. P. Jouppi, and J. E. Smith, Complexity-effective superscalar processors, Proc. 24th Int'l. Symp. on Comp. Architecture, pp. 1-13 (June 1997).
S. Palacharla and J. E. Smith, Decoupling integer execution in superscalar processors, Proc. 28th Ann. Symp. on Microarchitecture, pp. 285-290 (November 1995).
S. S. Sastry, S. Palacharla, and J. E. Smith, Exploiting idle floating-point resources for integer execution, Proc. Int'l. Conf. Progr. Lang. Design and Implementation, pp. 118-129 (June 1998).
K. I. Farkas, P. Chow, N. P. Jouppi, and Z. Vranesic, The multicluster architecture: Reducing cycle time through partitioning, Proc. 30th Ann. Symp. on Microarchitecture, pp. 149-159 (December 1997).
G. A. Kemp and M. Franklin, PEWs: A decentralized dynamic scheduler for ILP processing, Proc. of the Int'l. Conf. on Parallel Processing, pp. 239-246 (August 1996).
L. Gwennap, Digital 21264 sets new standard, Microprocessor Report 10(14):11-16 (October 1996).
D. Burger, T. M. Austin, and S. Bennett, Evaluating future microprocessors: The SimpleScalar tool set, Technical Report CS-TR-96-1308, University of Wisconsin-Madison (1996).
Standard Performance Evaluation Corporation, SPEC Newsletter (September 1995).
C. Lee, M. Potkonjak, and W. H. Mangione-Smith, Mediabench: A tool for evaluating and synthesizing multimedia and communications systems, Proc. IEEE-ACM Int'l. Symp. on Microarchitecture (MICRO 30), pp. 330-335 (December 1997).
D. Matzke, Will physical scalability sabotage performance gains, IEEE Computer 30(9):37-39 (September 1997).
R. Canal, J. M. Parcerisa, and A. Gonzalez, Dynamic cluster assignment mechanisms, Proc. Sixth Int'l. Symp. on High Performance Comp. Arch., pp. 133-142 (January 2000).
K. I. Farkas, Memory-System Design Considerations for Dynamically-Scheduled Microprocessors, Ph.D. thesis, Department of Electrical and Computer Engineering, University of Toronto, Canada (January 1997).
J. E. Smith, Decoupled access/execute computer architectures, ACM Trans. Computer Syst. 2(4):289-308 (November 1984).
M. Franklin, The multiscalar architecture, Ph.D. thesis, Technical Report TR 1196, Computer Sciences Department, University of Wisconsin-Madison (1993).
G. S. Sohi, S. E. Breach, and T. N. Vijaykumar, Multiscalar processors, Proc. 22nd Int'l. Symp. on Computer Architecture, pp. 414-425 (June 1995).
E. Rotenberg, Q. Jacobson, Y. Sazeides, and J. E. Smith, Trace processors, Proc. 30th Ann. Symp. on Microarchitecture, pp. 138-148 (December 1997).
S. Vajapeyam and T. Mitra, Improving superscalar instruction dispatch and issue by exploiting dynamic code sequences, Proc. Int'l. Symp. on Computer Architecture, pp. 1-12 (June 1997).
P. Marcuello and A. González, Clustered speculative multithreaded processors, Proc. 13th ACM Int'l. Conf. on Supercomputing, pp. 365-372 (June 1999).
M. M. Fernandes, J. Llosa, and N. Topham, Distributed modulo scheduling, Proc. Fifth Int'l. Symp. on High Performance Computer Architecture, pp. 130-134 (January 1999).
E. Nystrom and A. E. Eichenberger, Effective cluster assignment for modulo scheduling, Proc. 31st Ann. Symp. on Microarchitecture, pp. 103-114 (1998).
L. Gwennap, Intel's MMX speeds multimedia instructions, Microprocessor Report 10(3):1 (March 1996).

Author information

Authors and Affiliations

Department d'Arquitectura de Computadors, Universitat Politècnica de Catalunya, Cr. Jordi Girona, 1–3 Mòdul D6, 08034, Barcelona, Spain
Ramon Canal, Joan-Manuel Parcerisa & Antonio González

Corresponding author

Correspondence to Ramon Canal.

Cite this article

Canal, R., Parcerisa, JM. & González, A. Dynamic Code Partitioning for Clustered Architectures. International Journal of Parallel Programming 29, 59–79 (2001). https://doi.org/10.1023/A:1026483904675