The Potential of On-Chip Multiprocessing for QCD Machines

Bilardi, Gianfranco; Pietracaprina, Andrea; Pucci, Geppino; Schifano, Fabio; Tripiccione, Raffaele

doi:10.1007/11602569_41

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3769)

Included in the following conference series:

International Conference on High-Performance Computing

Abstract

We explore the opportunities offered by current and forthcoming VLSI technologies to on-chip multiprocessing for Quantum Chromodynamics (QCD), a computational grand challenge for which over half a dozen specialized machines have been developed over the last two decades. Based on a careful study of the information exchange requirements of QCD, both across the network and within the memory system, we derive the optimal partition of die area between storage and functional units. We show that a scalable chip organization holds the promise of delivering from hundreds to thousands of flops per cycle as VLSI feature size scales down from 90 nm to 20 nm over the next dozen years.
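
As a rough sanity check on the scaling claim, and not a reproduction of the paper's model, the sketch below assumes ideal geometric scaling: the area of a fixed design shrinks with the square of the feature size, so a die of constant area holds proportionally more storage and functional units. The function name `density_gain` is hypothetical and is introduced only for this illustration.

```python
def density_gain(old_nm: float, new_nm: float) -> float:
    """Ideal device-density gain when feature size shrinks from old_nm to new_nm.

    Assumption (not from the paper): the area of a fixed design scales with
    the square of the feature size, and die area stays constant.
    """
    if old_nm <= 0 or new_nm <= 0:
        raise ValueError("feature sizes must be positive")
    return (old_nm / new_nm) ** 2


if __name__ == "__main__":
    # Feature sizes quoted in the abstract: 90 nm down to 20 nm.
    gain = density_gain(90.0, 20.0)
    print(f"ideal density gain: {gain:.2f}x")
```

Under this assumption the gain from 90 nm to 20 nm is (90/20)^2 = 20.25, which is broadly consistent with the order-of-magnitude growth in flops per cycle that the abstract describes. The paper's own analysis also accounts for network and memory-system constraints, which this sketch ignores.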

This research was supported in part by MIUR of Italy under project “ALGO-NEXT: ALGOrithms for the NEXT generation Internet and the Web”, and by the University of Padova under Grant CPDA033838.



Author information

Authors and Affiliations

Dipartimento di Ingegneria dell’Informazione, Università di Padova, Via Gradenigo 6/B, 35131, Padova, Italy
Gianfranco Bilardi, Andrea Pietracaprina & Geppino Pucci
Dipartimento di Fisica, Università di Ferrara, and INFN, Via del Paradiso 12, 44100, Ferrara, Italy
Fabio Schifano & Raffaele Tripiccione

Editor information

Editors and Affiliations

College of Computing, Georgia Institute of Technology, 30332, Atlanta, GA, USA
David A. Bader
Department of Electrical and Computer Engineering, Rutgers, the State University of New Jersey, 94 Brett Road, 08854, Piscataway, NJ, USA
Manish Parashar
Satyam Computer Services Ltd., Indian Institute of Science Campus, Entrepreneurship Centre, SID Block, 560 012, Bangalore, India
Varadarajan Sridhar
Department of Electrical Engineering, University of Southern California, 90089-2562, Los Angeles, CA, USA
Viktor K. Prasanna

Cite this paper

Bilardi, G., Pietracaprina, A., Pucci, G., Schifano, F., Tripiccione, R. (2005). The Potential of On-Chip Multiprocessing for QCD Machines. In: Bader, D.A., Parashar, M., Sridhar, V., Prasanna, V.K. (eds) High Performance Computing – HiPC 2005. HiPC 2005. Lecture Notes in Computer Science, vol 3769. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11602569_41

DOI: https://doi.org/10.1007/11602569_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30936-9
Online ISBN: 978-3-540-32427-0
eBook Packages: Computer Science, Computer Science (R0)
