Abstract:
A packet-switched 6 × 4 2-D mesh network providing 2 Tb/s of bisection bandwidth with a per-hop latency of four cycles forms the high-performance communication fabric for a Single-Chip Cloud Computer (SCC) with 48 Pentium™-class IA-32 cores. The fabric operates on an independent power supply and frequency domain. The router micro-architecture achieves over 90% network utilization through effective use of a single-cycle Wrapped Wave-Front Allocator (WWFA) and virtual channel (VC) flow control. A router transit latency of 2 ns is achieved through early buffer write, route pre-computation, and a single-cycle WWFA implementation. This 640 K transistor, 1.32 mm² router operates at 2 GHz at 1.1 V while dissipating 550 mW. The 24-node mesh network with 1.28 Tb/s routers and 16B, 5.4 mm wide links consumes only 5% of the chip area, 1.2% of the transistors, and 10% of total chip power at 1.1 V in a 45 nm nine-metal CMOS process. The router energy efficiency scales from 1.3 Tb/s/W to 7.2 Tb/s/W over a dynamic voltage range from 0.7 V to 1.25 V.
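
The headline figures in the abstract can be cross-checked with simple arithmetic. The sketch below is only a back-of-envelope check, not the authors' methodology. It assumes that "16B" is the link width in bytes per cycle, that each router has five ports (four mesh directions plus one local port), and that the bisection figure counts both directions of each link crossing the cut. None of these assumptions is stated in the abstract, and all constant and function names are illustrative.

```python
# Back-of-envelope check of the abstract's figures.
# Assumptions (not stated in the abstract): 16 B is the link width per cycle,
# each router has 5 ports, and bisection bandwidth counts both link directions.

LINK_WIDTH_BYTES = 16      # "16B" links (assumed to mean bytes per cycle)
CLOCK_HZ = 2e9             # router clock at 1.1 V
PER_HOP_CYCLES = 4         # per-hop latency in cycles
ROUTER_PORTS = 5           # assumption: N, S, E, W plus local port
MESH_COLS, MESH_ROWS = 6, 4
ROUTER_POWER_W = 0.550     # router power at 2 GHz, 1.1 V


def link_bandwidth_bps():
    """One direction of one link, in bits per second."""
    return LINK_WIDTH_BYTES * 8 * CLOCK_HZ


def router_bandwidth_bps():
    """Aggregate bandwidth over all assumed router ports."""
    return ROUTER_PORTS * link_bandwidth_bps()


def bisection_bandwidth_bps():
    """Cut the mesh across its longer dimension (3 + 3 columns).

    The cut crosses one link per row; both directions are counted.
    """
    links_cut = min(MESH_COLS, MESH_ROWS)
    return links_cut * 2 * link_bandwidth_bps()


def hop_latency_s():
    """Per-hop latency in seconds at the given clock."""
    return PER_HOP_CYCLES / CLOCK_HZ


def router_efficiency_bps_per_w():
    """Router bandwidth per watt at the 1.1 V operating point."""
    return router_bandwidth_bps() / ROUTER_POWER_W


if __name__ == "__main__":
    print("link bandwidth (Gb/s):", link_bandwidth_bps() / 1e9)
    print("router bandwidth (Tb/s):", router_bandwidth_bps() / 1e12)
    print("bisection bandwidth (Tb/s):", bisection_bandwidth_bps() / 1e12)
    print("per-hop latency (ns):", hop_latency_s() * 1e9)
    print("router efficiency (Tb/s/W):", router_efficiency_bps_per_w() / 1e12)
```

Under these assumptions each link carries 256 Gb/s per direction, which gives 1.28 Tb/s per router, about 2.05 Tb/s across the bisection, and 2 ns per hop, all consistent with the quoted figures. The implied efficiency of roughly 2.3 Tb/s/W at 1.1 V falls inside the reported 1.3 to 7.2 Tb/s/W range.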
Published in: IEEE Journal of Solid-State Circuits (Volume: 46, Issue: 4, April 2011)