ABSTRACT
In the multicore era, capturing processor execution traces is indispensable for debugging complex software. The inability to transfer vast amounts of trace data off-chip without significant slowdown has impeded the debugging of such software, both in pre-silicon emulation and in real designs. We consider on-chip trace compression, performed in hardware, to reduce data volume, using techniques that exploit the inherent higher-order redundancy in address trace data. Hardware trace compression is often limited to poor or moderate compression performance by area and memory constraints; we present a parameterizable scheme that instead leverages resources already found on existing platforms. By harnessing resources such as existing trace buffers on CPUs and unused embedded memory on FPGA emulation platforms, our trace compression scheme requires only a small amount of additional hardware area to achieve superior compression ratios.
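As a purely illustrative sketch of the kind of redundancy an address trace carries, the following Python models delta encoding followed by run-length encoding in software. It is not the scheme proposed in the paper, which is a hardware design whose details the abstract does not give; the function names and the sample trace are hypothetical.

```python
from typing import List, Tuple


def compress_addresses(addresses: List[int]) -> List[Tuple[int, int]]:
    """Delta-encode an address trace, then run-length encode the deltas.

    Returns (delta, run_length) pairs. The first delta is taken relative
    to 0, so it carries the starting address. Illustrative only: this is
    an assumption-laden software model, not the paper's hardware scheme.
    """
    runs: List[Tuple[int, int]] = []
    previous = 0
    for address in addresses:
        delta = address - previous
        previous = address
        if runs and runs[-1][0] == delta:
            # Same stride as the previous address: extend the current run.
            runs[-1] = (delta, runs[-1][1] + 1)
        else:
            runs.append((delta, 1))
    return runs


def decompress_addresses(runs: List[Tuple[int, int]]) -> List[int]:
    """Invert compress_addresses by replaying each delta run_length times."""
    addresses: List[int] = []
    current = 0
    for delta, count in runs:
        for _ in range(count):
            current += delta
            addresses.append(current)
    return addresses


# Hypothetical trace: eight sequential 4-byte instructions, a taken branch,
# then four more sequential instructions at the branch target.
trace = [0x1000 + 4 * i for i in range(8)] + [0x2000 + 4 * i for i in range(4)]
runs = compress_addresses(trace)
assert decompress_addresses(runs) == trace  # the encoding is lossless
print(len(trace), "addresses ->", len(runs), "runs")
```

Sequential instruction fetches produce long runs of identical deltas, so most of the trace collapses into a few pairs. Real traces also repeat whole loop bodies and call sequences, a longer-range pattern this simple model does not capture.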
Index Terms
- Real-time address trace compression for emulated and real system-on-chip processor core debugging