Elsevier

Integration

Volume 58, June 2017, Pages 91-100

On supporting rapid prototyping of embedded systems with reconfigurable architectures

https://doi.org/10.1016/j.vlsi.2017.02.007

Highlights

  • We present a modern virtual prototyping platform and identify some of the most time-consuming steps in development.

  • We define and extract necessary task information through profiling, high-level synthesis and task execution on an FPGA.

  • We model task mapping as an optimisation problem and solve it with genetic algorithms.

  • We optimise the tasks assigned to hardware by performing further design space exploration (DSE) with the help of an FPGA.

Abstract

Reducing time-to-market while improving product quality is a major challenge. This paper proposes a software-supported framework for rapid prototyping that enables fast, concurrent hardware/software system-level design. The framework supports continuous evaluation and verification of the prototype under development, and it automatically maps functionality to hardware via High-Level Synthesis techniques. We evaluate the framework and its software instantiation with a computer vision algorithm. Our experiments show that the approach reduces development time by almost 64× and prunes the hardware design space by 34×, while retaining designs with high Quality-of-Results on the Pareto frontier.

Introduction

Designing full system solutions is a complex task. With vastly increased complexity and functionality, especially in the nanometer era where hundreds of millions of transistors are integrated on a single chip, the design of complex Integrated Circuits (ICs) has become challenging. In addition, the continuously increasing demand for even higher performance, i.e. in terms of operating frequency and power consumption, makes new design techniques necessary.

This problem becomes far more important if we consider that the software aspects of ICs can account for 80% or more of embedded systems development cost, which makes the conventional approach to product development insufficient. For instance, the International Technology Roadmap for Semiconductors (ITRS) [1] predicts that software development costs will increase and reach rough parity with hardware costs, even with the advent of multi-core software development tools.

Electronic Design Automation (EDA) tools are crucial nowadays for deriving optimal solutions. Existing workflows are built on the fundamental premise that models are fully interchangeable and interoperable among different EDA vendors throughout the physical prototyping process, including architectural analysis, simulation and synthesis. Even though this concept seems straightforward and promising, it has proven completely elusive in the world of Electronic System Level design: existing solutions provide neither model interoperability nor independence between models and software tools. As such, it is often desirable to describe the target application at the highest possible system level in order to avoid vendor lock-in.

Apart from the technology-oriented parameters that affect the efficiency and/or the flexibility of a digital system, tight time-to-market requirements mean that conventional approaches to product development, e.g. starting software development only after the hardware is finalised, usually lead to missed market windows and revenue opportunities. Hence, software developers need to get an early start on their work, long before the Register-Transfer Level description of the hardware is finalised.

To this end, and as research pushes for better programming models for multi-processor and multi-core embedded systems, Virtual Platforms (VPs) address one of today's biggest challenges in physical design: enabling sufficient software development, debugging and validation before the hardware device becomes available. More specifically, virtualization makes it possible to model a hardware platform consisting of different processing cores, memories, peripherals and interconnection schemes in the form of a simulator. Furthermore, as hardware development progresses, updated versions of the VP can be redistributed to the software teams, giving them a gradually more accurate description of the target architecture.

The concept of virtualization is also important for hardware architects, as it enables easier verification of Intellectual Property (IP) kernels. This feature can be employed both when only a few of the application kernels have to be developed in hardware and when incremental system prototyping is performed. In both cases, virtualization provides all the mechanisms necessary for co-simulation and verification between the IPs developed at Register Transfer Level (RTL) and the remaining application functionality executed on the VP.

In this paper we identify common pitfalls during virtual prototyping for hardware/software co-design and propose a software-supported methodology for rapid system-level prototyping of complex digital systems. More specifically, we:

  • Present a modern virtual prototyping platform in Section 2 and explain some of the most time-consuming steps in development.

  • Define and extract necessary task information through profiling, high-level synthesis and task execution on an FPGA (Section 3.1).

  • Model task mapping as an optimisation problem and solve it with genetic algorithms (Section 3.2).

  • Optimise the tasks assigned to hardware by performing further DSE with the help of an FPGA (Section 3.3).

  • Evaluate our proposed workflow on the Harris & Stephens Corner Detection Algorithm from Computer Vision (Section 4). Starting from a full-software solution, we reach an optimal mixed (hardware/software) one 6 times faster than a conventional approach to virtual prototyping (Section 5).

  • Mention other relevant techniques that can be used for task partitioning, as well as other prototyping frameworks, and explain where and why our proposed toolflow works better (Section 6).

Our conclusion is that rapid prototyping of multi-million gate systems is achievable. We can have prototypes of our system on-the-go, as we modify, add, or optimise the algorithms that describe the system tasks.
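The task-mapping step listed above can be illustrated with a minimal sketch. It assumes a binary encoding (one bit per task, 1 = hardware, 0 = software), a single objective (total execution time) and an area budget enforced through a penalty term. The task figures, the budget, the function names and the GA parameters are all hypothetical, and the single-objective formulation is a simplification chosen for brevity; it is not presented as the formulation used in the paper.

```python
import random

# Hypothetical per-task figures (not taken from the paper):
# (software time, hardware time, hardware area), arbitrary units.
TASKS = [
    (40.0, 5.0, 30),
    (25.0, 4.0, 25),
    (10.0, 6.0, 20),
    (8.0, 7.0, 15),
    (30.0, 3.0, 40),
    (5.0, 4.0, 10),
]
AREA_BUDGET = 100  # assumed FPGA resource budget


def cost(mapping):
    """Total execution time of a mapping (1 = hardware, 0 = software).

    Mappings that exceed the area budget are penalised in proportion to
    the overshoot, which steers the search back to feasible solutions.
    """
    time = sum(hw if bit else sw for bit, (sw, hw, _) in zip(mapping, TASKS))
    area = sum(a for bit, (_, _, a) in zip(mapping, TASKS) if bit)
    penalty = 100.0 * max(0, area - AREA_BUDGET)
    return time + penalty


def evolve(pop_size=30, generations=60, mutation_rate=0.1, seed=0):
    """Run a small elitist genetic algorithm and return the best mapping."""
    rng = random.Random(seed)
    n = len(TASKS)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=cost)
        elite = population[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randint(1, n - 1)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with probability mutation_rate.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = elite + children
    return min(population, key=cost)


if __name__ == "__main__":
    best = evolve()
    print("mapping:", best, "cost:", cost(best))
```

A multi-objective variant would keep time and area as separate objectives and return a set of non-dominated mappings instead of a single one.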

Section snippets

Background: virtual prototyping

Fig. 1 depicts three consecutive design stages while using a virtual prototyping platform: (i) system modeling, (ii) rapid virtual prototyping and (iii) system integration. Different virtualization environments can be employed for this purpose. Without affecting the generality of VPs, we refer here to the OVP [2], since it is a publicly available and easily extensible approach. Additionally, the increased simulation speed provided by OVPSim ensures that complex systems can be modeled in

Proposed rapid prototyping framework

To make more effective use of the virtual platform that designers have at their disposal, we propose a software-supported framework that enables software and hardware teams to develop a product jointly, with close interaction during the development phases. The main competitive advantages of this framework are the automatic partitioning of system functions to hardware and the PC-based co-simulation it provides, which trades off between speed (functional simulation) and

Evaluation system setup

This section introduces the use case employed to demonstrate the efficiency of our rapid prototyping framework. We first discuss the limitations posed by the implementation of Computer Vision (CV) algorithms and then focus on the specific limitations posed by the efficient implementation of a representative CV algorithm, namely the Harris & Stephens Corner Detection Algorithm.
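For readers unfamiliar with the use case, the following pure-Python sketch computes the standard Harris corner response R = det(M) - k * trace(M)^2, where M is the local structure tensor built from image gradients. It is illustrative only and is not the implementation evaluated in the paper: the function name, the central-difference gradients, the unweighted 3x3 window and the value k = 0.04 are assumptions made for this example.

```python
def harris_response(img, k=0.04):
    """Return the Harris response R for every pixel of a grey-scale image.

    `img` is a list of rows of numeric grey values. Gradients use central
    differences, and the structure tensor is summed over an unweighted
    3x3 window (a simplification; Gaussian weighting is more common).
    Border pixels keep a response of zero.
    """
    h, w = len(img), len(img[0])

    # Horizontal and vertical gradients; the border keeps a zero gradient.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0

    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Accumulate the structure tensor entries over the 3x3 window.
            sxx = sxy = syy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx = ix[y + dy][x + dx]
                    gy = iy[y + dy][x + dx]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            resp[y][x] = det - k * trace * trace
    return resp


if __name__ == "__main__":
    # Synthetic test image: a bright square whose top-left corner is at (4, 4).
    image = [[1.0 if (y >= 4 and x >= 4) else 0.0 for x in range(12)]
             for y in range(12)]
    r = harris_response(image)
    print("corner:", r[4][4], "edge:", r[8][4], "flat:", r[1][1])
```

The response is positive at corners, negative along straight edges and zero in flat regions. Every pixel needs the same small window of multiply-accumulate operations, and this regular, data-parallel structure is what makes kernels of this kind natural candidates for hardware mapping.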

Experimental results

This section describes the experimental results obtained by applying the introduced framework to implement the Harris algorithm on a recent Xilinx Kintex-7 (xc7k325tffg900-2) board. Initially, we profile the Harris algorithm in order to determine the most computationally intensive tasks. For this purpose the Valgrind suite [18] was employed. The results of this analysis are depicted, as percentages of total execution time, on the left part of Fig. 10. Based on this analysis, we can concentrate our

Related work

The partitioning problem constitutes an integral part of modern system design, due to the continuously increasing need for higher performance that coexists with resource utilisation bottlenecks. Numerous algorithms have been proposed in the literature to address this problem, although no polynomial-time, globally optimal algorithm for balance-constrained partitioning is known so far. However, several efficient heuristics were developed which find high-quality

Conclusion

A design framework for supporting rapid prototyping of multi-million gate systems was introduced. The framework efficiently reduces the solution space by pruning during design space exploration, while the design configurations under evaluation are rapidly developed using HLS techniques. Then, the designs are evaluated using an early system prototype that is based on a host CPU and an FPGA device. The simulation on such a system is highly accelerated by the FPGA-in-the-loop

References (31)

  • A. Zhou et al., Multiobjective evolutionary algorithms: a survey of the state of the art, Swarm Evolut. Comput. (2011)
  • H. Bay et al., Speeded-up robust features (SURF), Comput. Vis. Image Underst. (2008)
  • ITRS, International Technology Roadmap for Semiconductors, 2012. URL...
  • OVP, Open Virtual Platforms (OVP), 2013. URL...
  • D. Diamantopoulos et al., Plug&Chip: a framework for supporting rapid prototyping of 3D hybrid virtual SoCs, ACM Trans. Embed. Comput. Syst. (2014)
  • F. Slomka et al., Hardware/software codesign and rapid prototyping of embedded systems, IEEE Des. Test. Comput. (2000)
  • J. Weidendorfer, M. Kowarschik, C. Trinitis, A tool suite for simulation based analysis of memory access behavior,...
  • N. Nethercote, J. Seward, Valgrind: A framework for heavyweight dynamic binary instrumentation, in: Proceedings of the...
  • F. Kordon et al., An introduction to rapid system prototyping, IEEE Trans. Softw. Eng. (2002)
  • M. Farshbaf, M.R. Feizi-Derakhshi, Multi-objective Optimization of Graph Partitioning Using Genetic Algorithms, in:...
  • K. Deb, A. Pratap, S. Agarwal, T. Meyarivan, A fast and elitist multiobjective genetic algorithm: NSGA-II, IEEE Transactions on Evolutionary...
  • G. Palermo, C. Silvano, V. Zaccaria, An efficient design space exploration methodology for on-chip multiprocessors...
  • G. Mariani, G. Palermo, C. Silvano, V. Zaccaria, Multi-processor system-on-chip Design Space Exploration based on...
  • Xilinx, Inc., URL...
  • V. Chaudhary, J. K. Aggarwal, Parallelism in computer vision: a review, in: V. Kumar, P. S. Gopalakrishnan, L. N. Kanal...