The Stratix™ 10 Highly Pipelined FPGA Architecture

Authors:

Jeffrey Chromczak,

David Galloway,

Valavan Manohararajah,

Tim Vanderhoek,

John Van Dyken

FPGA '16: Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

Pages 159 - 168

https://doi.org/10.1145/2847263.2847267

Published: 21 February 2016

Abstract

This paper describes architectural enhancements in the Altera Stratix™ 10 HyperFlex™ FPGA architecture, fabricated in the Intel 14nm FinFET process. Stratix 10 includes ubiquitous flip-flops in the routing to enable a high degree of pipelining. In contrast to earlier architectural explorations of pipelining in pass-transistor-based architectures, the direct-drive routing fabric in Stratix-style FPGAs enables an extremely low-cost pipeline register. The presence of ubiquitous flip-flops simplifies circuit retiming and improves performance. The availability of predictable retiming affects all stages of the cluster, place, and route flow. Ubiquitous flip-flops require a low-cost clock network with sufficient flexibility to enable pipelining of dozens of clock domains. Different cost/performance tradeoffs in a pipelined fabric, together with the use of a 14nm process, lead to other modifications to the routing fabric and the logic element. User modification of the design enables even higher performance, averaging 2.3X faster in a small set of designs.
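The claim that ubiquitous routing flip-flops simplify retiming can be illustrated with a toy model. The sketch below is not the paper's CAD algorithm: it assumes a single feed-forward path, made-up delays in arbitrary units, and a hypothetical helper `best_stage_delay` that brute-forces register positions. It compares the worst pipeline-stage delay when registers may sit only at logic-element outputs with the case where every routing hop offers a register.

```python
from itertools import combinations

def best_stage_delay(delays, allowed_cuts, num_registers):
    """Return the smallest worst-stage delay reachable on a linear path.

    delays        -- per-element delays along the path (hypothetical units)
    allowed_cuts  -- boundary indices where a register may be placed;
                     index i means "between delays[i-1] and delays[i]"
    num_registers -- number of registers to place (latency is kept fixed)
    """
    best = sum(delays)  # no register at all: one long combinational stage
    k = min(num_registers, len(allowed_cuts))
    for cuts in combinations(sorted(allowed_cuts), k):
        bounds = [0, *cuts, len(delays)]
        worst = max(sum(delays[a:b]) for a, b in zip(bounds, bounds[1:]))
        best = min(best, worst)
    return best

# Assumed path: LUT (3), four wire hops (1 each), LUT (3), two wire hops (1 each).
path = [3, 1, 1, 1, 1, 3, 1, 1]

# Registers only at the two logic-element outputs (after element 0 and element 5).
logic_only = best_stage_delay(path, {1, 6}, num_registers=2)

# Registers available at every boundary, as with flip-flops in the routing.
ubiquitous = best_stage_delay(path, set(range(1, len(path))), num_registers=2)

print(logic_only, ubiquitous)  # 7 5
```

With the same two registers, and therefore the same latency, the worst stage drops from 7 to 5 units in this toy case because the registers can be moved onto routing hops that balance the stages. Real retiming must also handle fan-out, feedback loops, and multiple clock domains, none of which this sketch models.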

Index Terms

The Stratix™ 10 Highly Pipelined FPGA Architecture
1. Hardware
  1. Integrated circuits
    1. Reconfigurable logic and FPGAs
      1. Programmable interconnect
      2. Programmable logic elements

Published In

FPGA '16: Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

February 2016

298 pages

ISBN:9781450338561

DOI:10.1145/2847263

General Chair: Deming Chen (University of Illinois at Urbana-Champaign, USA)

Program Chair: Jonathan Greene (Microsemi, USA)

Copyright © 2016 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGDA: ACM Special Interest Group on Design Automation

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

FPGA'16

Sponsor:

SIGDA

FPGA'16: The 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays

February 21 - 23, 2016

Monterey, California, USA

Acceptance Rates

FPGA '16 Paper Acceptance Rate 20 of 111 submissions, 18%;

Overall Acceptance Rate 125 of 627 submissions, 20%
