Abstract:
Shared memory is the predominant programming model in today's MPSoCs. However, existing SoC on-chip communication standards like AMBA rely on the interconnect for ordering. This becomes a problem as the number of actors increases: traditional simple interconnects like buses and crossbars do not scale, yet scalable distributed NoCs are inherently unordered. Without built-in ordering capability in the NoC, cache coherence protocols have to rely on external ordering points, which forward requests so that every cache observes them in the same order. Such ordering points, however, incur significant scalability issues, such as indirection latency and communication hotspots in the network. In this paper, we propose a universal ordered NoC platform for shared-memory MPSoC designs that provides coherence request ordering in addition to communication. The proposed solution is based on a separate lightweight ordering network that establishes the global request order, which the receiving NIC uses when delivering requests. It provides comprehensive support for general network topologies and various levels of memory consistency, while adhering to existing cache coherence protocol standards. Full-system simulation with heterogeneous MPSoC Rodinia benchmarks shows that it reduces request latency by 37.6% and 35.7% over ordering points in 2D-mesh and butterfly fat tree topologies, respectively. This translates to overall runtime improvements of 17.8% for a 36-node 2D-mesh MPSoC and 12.0% for a 32-node butterfly fat tree MPSoC.
Date of Conference: 02-06 November 2015
Date Added to IEEE Xplore: 07 January 2016