# Visualizations for Understanding SoC Behaviour

Dave M<sup>c</sup>Ewan, Centre for Doctoral Training in Communications, University of Bristol, Bristol, UK, dave.mcewan@bristol.ac.uk

Marcin Hlond, Director of System Engineering, UltraSoC Technologies Ltd, Bristol, UK, marcin.hlond@ultrasoc.com

Dr Jose Nunez-Yanez, Dept. of Microelectronics, University of Bristol, Bristol, UK, j.l.nunez-yanez@bristol.ac.uk

*Abstract*—This paper introduces a novel method of analysis for System-on-Chip (SoC) development building upon commonly used tools and techniques to approximate and automate the human process of investigation. Knowledge of the interactions between components within a SoC is essential for understanding how a system works so the presented method provides a way of visualizing these interactions. The mathematical basis for the method is explained and justified, then the method is demonstrated using two representative case studies. Visualizations from the case studies are used to exhibit the usefulness of the method for system optimization, monitoring, and validation.

# I. INTRODUCTION

The SoCs comprising modern silicon products are often the result of lifetimes of work by hundreds of engineers, which makes it all but impossible for a single systems architect to have a complete understanding of every part of a design. This means that while at first glance a system may appear to be functioning, unforeseen behaviours may arise from the interactions between system components, possibly leading to undesirable outcomes such as reduced performance, increased energy usage, information leakage, or unexpected susceptibility to faults. UltraSoC Technologies Ltd (UltraSoC) is a silicon IP supplier specializing in embedded analytics which uses highly configurable monitoring components to address these issues. Analysing the resulting volume of data using traditional methods such as assertions is difficult and time consuming due to systemic complexity and dynamic behaviour. In this research we propose a visual tool and mathematical framework that help to understand these behaviours and build upon the instrumentation technology developed by project partner UltraSoC. Two case studies are used to illustrate the methodology: a synthesizable model of a simple SoC, and a complex SoC with lightweight software instrumentation enabled by the use of UltraSoC tools.

The main contributions of this paper are: 1) A mathematical framework for approximating the human process of investigation for binarized time series SoC data. 2) A novel visualization technique for behavioural relationships.

978-1-7281-3549-6/19/\$31.00 ©2019 IEEE

# II. PREVIOUS WORK

An examination of currently available hardware and low-level software profiling methods is given by Lagraa [1], which covers well-known techniques such as using counters to generate statistics about both hardware and software events, effectively a low-cost form of data compression. Lagraa's thesis is based on profiling SoCs created specifically on Xilinx MP-SoC devices, an approach which, although powerful, may not be applicable to data from other sources, such as post-silicon. Lo et al. [2] described a system for describing behaviour with a series of statements using a search space exploration process based on boolean set theory. While this work has a similar goal of finding temporal dependencies, it is acknowledged that the mining method does not perform adequately for the very long traces often found in real-world SoC data. Another limitation is that humans are less receptive to information presented as long lists of statements than to a visual representation. Ivanovic et al. [3] review time series analysis models and methods, describing characteristic features of economic time series such as high auto-dependence and inter-dependence, high correlation, non-stationarity, and noisy sources. SoC data is expected to have these same features, together with full binarization and much greater length. Explainability is a key requirement for understanding, so related approaches such as the use of Neural Networks (NNs) have been avoided at this stage, although these may be useful for higher level analysis.

# III. METHODOLOGY

This methodology has been designed to approximate and mimic the process of an experienced SoC engineer trying to understand how waveforms are related to each other. It is assumed that the measurements are taken at discrete times t, often referred to as a number of clock cycles, and that all values are binary,  $f_i(t) \in \{0,1\}$ . Additionally, it is assumed that data for every time step can be recorded, or accurately inferred, which depends on changes in measurement values being sparse enough for the data to be stored at a physically feasible rate.

This project is supported by the Engineering and Physical Sciences Research Council (EP/I028153/ and EP/L016656/1); the University of Bristol and UltraSoC Technologies Ltd. Supervised by Dr Jose Nunez-Yanez and Professor Kerstin Eder.

The first approximation utilized is the application of a bell-shaped windowing function w, which is similar to how we focus attention on the centre of a time period  $t \in [u, v)$ . A power-of-sine windowing function w is used to create a weighted average of each measurement, giving the expected value.

$$w(t) = \begin{cases} \sin^{\alpha} \left( \frac{t\pi}{v-u-1} \right) & : t \in [u,v) \\ 0 & : \text{otherwise} \end{cases} \tag{1}$$

$$\mathbb{E}[f_i] = \frac{1}{\sum w} \sum_{t \in [u,v)} w(t) * f_i(t) \quad \in [0,1] \tag{2}$$
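The windowed expectation of Equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the default $\alpha = 2$, and the choice to count t from the start of the window (so that the sine argument runs from 0 to $\pi$) are all assumptions.

```python
import math


def sine_window(u, v, alpha=2.0):
    """Return the weights w(t) of Equation (1) for each t in [u, v).

    Assumption: t is counted from the start of the window, so the
    sine argument runs from 0 to pi and the weights peak mid-window.
    """
    n = v - u
    if n < 2:
        return [1.0] * max(n, 0)  # a one-sample window has a single weight
    # max(0.0, ...) guards against a tiny negative sine from rounding at pi.
    return [max(0.0, math.sin(k * math.pi / (n - 1))) ** alpha for k in range(n)]


def expectation(f, w):
    """Weighted average of a binary measurement f, as in Equation (2)."""
    return sum(wi * fi for wi, fi in zip(w, f)) / sum(w)


w = sine_window(u=0, v=5, alpha=2.0)
print(expectation([0, 0, 1, 0, 0], w))  # activity at the centre weighs most
print(expectation([1, 0, 0, 0, 1], w))  # activity at the edges weighs least
```

A single pulse at the centre of the window contributes far more to the expected value than pulses at its edges, which is the intended analogue of focusing attention on the middle of a time period.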

Bayes' theorem in Equation (3) and the definition of independence in Equation (5) are used to allow the conditional expectation in Equation (4) and a measure of dependency,  $\dot{\text{Dep}}$  in Equation (6), to be calculated. In order to reduce the amount of information stored, a threshold is introduced which approximates the process of putting relationships in natural language form.

$$\Pr(X|Y) = \frac{\Pr(Y|X)\Pr(X)}{\Pr(Y)}; \quad \Pr(Y) \neq 0 \tag{3}$$

$$\mathbb{E}[f_x|f_y] = \frac{\mathbb{E}[f_x * f_y]}{\mathbb{E}[f_y]}; \quad \mathbb{E}[f_y] \neq 0 \tag{4}$$

$$X \perp Y \iff \Pr(X) = \Pr(X|Y) \tag{5}$$

$$\text{let} \quad \varphi = \frac{\mathbb{E}[f_x|f_y] - \mathbb{E}[f_x]}{\mathbb{E}[f_x|f_y]} = 1 - \frac{\mathbb{E}[f_x]\mathbb{E}[f_y]}{\mathbb{E}[f_x * f_y]}$$

$$\dot{\text{Dep}}(f_x, f_y) := \begin{cases} \varphi & : 0 \leqslant \varphi \\ 0 & : \text{otherwise} \end{cases} \tag{6}$$

$\dot{\text{Cov}}$  is a measure of covariance, as shown in Equation (9), where the factor of 4 follows from the bound in Equation (8) and scales the result to at most 1. Here the use of a weighted average and a threshold function approximates the human processes of focusing attention and discarding pairwise correlations which are insignificantly small or negative.

$$\operatorname{cov}(X,Y) = \mathbb{E}[XY] - \mathbb{E}[X]\mathbb{E}[Y] \tag{7}$$

$$X, Y \in [0, 1] \implies \frac{-1}{4} \leq \operatorname{cov}(X, Y) \leq \frac{1}{4} \tag{8}$$

$$\text{let} \quad \varphi = 4 \Big( \mathbb{E}[f_x * f_y] - \mathbb{E}[f_x] \mathbb{E}[f_y] \Big)$$

$$\dot{\text{Cov}}(f_x, f_y) := \begin{cases} \varphi & : 0 \leqslant \varphi \\ 0 & : \text{otherwise} \end{cases} \tag{9}$$

The measures  $\dot{\text{Dep}}$  and  $\dot{\text{Cov}}$  are symmetric, i.e.  $\dot{\text{Dep}}(X, Y) = \dot{\text{Dep}}(Y, X)$ , share the same codomain [0, 1], and operate on the same domain [0, 1], which allows their output to form new measurements for a meta-analysis.
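As a rough illustration of Equations (6) and (9), the following Python sketch computes both measures for a pair of binary measurements. The function names are hypothetical, a uniform window is assumed by default for brevity, and returning zero dependency when $\mathbb{E}[f_x * f_y] = 0$ (where Equation (6) is undefined) is an assumption of this sketch rather than something stated in the text.

```python
def expectation(f, w):
    """Weighted average of measurement f under window weights w."""
    return sum(wi * fi for wi, fi in zip(w, f)) / sum(w)


def dep_cov(fx, fy, w=None):
    """Return (Dep, Cov) for two equal-length binary measurements."""
    if w is None:
        w = [1.0] * len(fx)  # assumption: uniform window for brevity
    e_x = expectation(fx, w)
    e_y = expectation(fy, w)
    e_xy = expectation([x * y for x, y in zip(fx, fy)], w)
    # Equation (6) is undefined when E[fx * fy] is zero; treating that
    # case as "no dependency" is an assumption of this sketch.
    phi_dep = 1.0 - (e_x * e_y) / e_xy if e_xy > 0 else 0.0
    phi_cov = 4.0 * (e_xy - e_x * e_y)
    # Thresholding discards negative values, as in Equations (6) and (9).
    return max(0.0, phi_dep), max(0.0, phi_cov)


print(dep_cov([1, 1, 0, 0], [1, 1, 0, 0]))  # identical measurements
print(dep_cov([1, 1, 0, 0], [1, 0, 1, 0]))  # statistically independent
print(dep_cov([1, 1, 0, 0], [0, 0, 1, 1]))  # never high together
```

Swapping the two arguments gives the same pair of values, matching the symmetry noted above.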

Each measurement lends itself to implied relationships, e.g. "X *high* leads to ..." or "X *rising* leads to ...". This methodology is focused on binary measurements so four implied measurements are considered:

1) Measurement,  $f(t) \in [0, 1]$ .
2) Reflection,  $\neg(t) := 1 - f(t)$ .
3) Rising edge,  $\uparrow(t) := \max(0, f(t) - f(t-1))$ .
4) Falling edge,  $\downarrow(t) := \max(0, \neg(t) - \neg(t-1))$ .
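The four implied measurements can be derived from a single recorded trace as sketched below. The function names are illustrative, and setting the edge value at the first sample to zero (where no previous sample exists) is an assumption.

```python
def reflection(f):
    """Reflection: 1 where f is 0, and 0 where f is 1."""
    return [1 - x for x in f]


def rising(f):
    """Rising edge: 1 at each t where f(t) - f(t-1) is positive.

    Assumption: the first sample has no predecessor, so its edge is 0.
    """
    return [0] + [max(0, f[t] - f[t - 1]) for t in range(1, len(f))]


def falling(f):
    """Falling edge: a rising edge of the reflection."""
    return rising(reflection(f))


f = [0, 1, 1, 0, 1]
print(reflection(f))  # [1, 0, 0, 1, 0]
print(rising(f))      # [0, 1, 0, 0, 1]
print(falling(f))     # [0, 0, 0, 1, 0]
```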

Using these four implied measurements for each real one means that the use of thresholds to effectively discard negative dependencies and covariances does not miss significant relationships, but puts them into the more natural form, e.g. "X *low*..." vs "X *not high*...". The technique used to visualize some information about each measurement in a single time window combines these into a quad, as shown in Fig. 1. This allows some important information to be gleaned from just the colour of the quad sections, allowing some usage even in a blurred, low-resolution or faraway view where the text is unclear.

Fig. 1: Mockup of the representation of a single measurement. Colours from the 2D colourspace allow the viewer to quickly estimate six values of interest.

Fig. 2: Colourspace for visualization of 2-dimensional bounded values described by Equation (10) through Equation (14).

A novel visualization has been developed in order to represent data points in the space  $[0, 1]^2$  which allows the viewer to quickly determine the rough location of a point from its colour. Equation (10) through Equation (14) show the mapping from  $[0, 1]^2$  to 8-bit Red/Green/Blue (RGB) values as depicted in Fig. 2. This colourspace displays equally well on screen and printed paper and looks similar to people with protanopia, the most common form of colourblindness.

When looking for a pairwise relationship it is necessary to look slightly forward or backward in time to find a result such as "X is likely 5 cycles after Y". The notation  $f_{i_{\langle\delta\rangle}}(t) := f_i(t + \delta)$  is used to represent the notion of measurement *i* being shifted by  $\delta$  cycles. A link between two measurements  $f_x$  and  $f_{y_{\langle\delta\rangle}}$  is said to be significant when both  $\dot{\mathrm{Dep}}(f_x, f_{y_{\langle \delta \rangle}})$  and  $\dot{\mathrm{Cov}}(f_x, f_{y_{\langle \delta \rangle}})$  are greater than zero. For a high level understanding, knowing that a significant link exists is more important than knowing the exact values of dependency and covariance, and a rough estimate may often be good enough. By arranging all measurements in a circle and drawing edges, again using the colourspace described above, a network of behavioural relationships is formed, as shown in the diagrams generated from the case studies in Fig. 3 and Fig. 4. This arrangement shows behaviour as connections of a graph, with node attributes providing a summary of the measured behaviour in an easily digestible manner, ultimately saving engineering time by automating a large part of the human analysis process.
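The search for significant links over a range of shifts can be sketched as follows. This is an illustrative reading of the text, not the authors' code: the function names are hypothetical, a uniform window is used for brevity, samples shifted in from outside the recorded range are assumed to be zero, and zero dependency is assumed where $\mathbb{E}[f_x * f_y] = 0$.

```python
def dep_cov(fx, fy):
    """Return (Dep, Cov) under a uniform window (an assumption)."""
    n = len(fx)
    e_x = sum(fx) / n
    e_y = sum(fy) / n
    e_xy = sum(x * y for x, y in zip(fx, fy)) / n
    phi_dep = 1.0 - (e_x * e_y) / e_xy if e_xy > 0 else 0.0
    phi_cov = 4.0 * (e_xy - e_x * e_y)
    return max(0.0, phi_dep), max(0.0, phi_cov)


def shift(f, delta):
    """Return g with g(t) = f(t + delta), zero-padded out of range."""
    n = len(f)
    return [f[t + delta] if 0 <= t + delta < n else 0 for t in range(n)]


def significant_links(fx, fy, max_delta):
    """List (delta, Dep, Cov) for shifts where both measures are positive."""
    links = []
    for delta in range(-max_delta, max_delta + 1):
        d, c = dep_cov(fx, shift(fy, delta))
        if d > 0 and c > 0:
            links.append((delta, d, c))
    return links


# Hypothetical traces: fx pulses two cycles after each fy pulse.
fy = [0, 1, 0, 0, 0, 0, 1, 0, 0, 0]
fx = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
links = significant_links(fx, fy, max_delta=4)
best = max(links, key=lambda link: link[2])
print(best[0])  # shift of fy with the strongest covariance against fx
```

In this example the strongest link is found at $\delta = -2$, i.e. $f_x$ lines up with $f_y$ as it was two cycles earlier. Weaker links at other shifts can also appear, because a shifted pulse may coincide with a different pulse in the other trace.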

$$\theta = \left(1 - \frac{\sqrt{a^2 + b^2}}{\sqrt{2}}\right)^{\gamma} \tag{10}$$

$$\phi = \arctan \frac{b}{a} \tag{11}$$

$$\operatorname{red} = \lfloor 255 \times \theta \rfloor \tag{12}$$

$$\operatorname{green} = \lfloor 255 \times \theta^{\max(0, \frac{\pi}{4} - \phi) + 1} \rfloor \tag{13}$$

$$\operatorname{blue} = \lfloor 255 \times \theta^{\max(0, \phi - \frac{\pi}{4}) + 1} \rfloor \tag{14}$$
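The colour mapping of Equations (10) to (14) can be sketched directly in Python. The value of $\gamma$ is not given in the text, so the default of 1 here is an assumption, as are the function name and the use of `atan2` so that the angle is defined when $a = 0$.

```python
import math


def to_rgb(a, b, gamma=1.0):
    """Map a point (a, b) in [0, 1]^2 to 8-bit RGB per Equations (10)-(14)."""
    radius = math.sqrt(a * a + b * b)
    # max(0.0, ...) guards against a tiny negative value from rounding.
    theta = max(0.0, 1.0 - radius / math.sqrt(2.0)) ** gamma
    # For a, b >= 0 this equals arctan(b / a) and is also defined at a == 0.
    phi = math.atan2(b, a)
    red = math.floor(255 * theta)
    green = math.floor(255 * theta ** (max(0.0, math.pi / 4 - phi) + 1.0))
    blue = math.floor(255 * theta ** (max(0.0, phi - math.pi / 4) + 1.0))
    return red, green, blue


print(to_rgb(0.0, 0.0))  # (255, 255, 255): both values low gives white
print(to_rgb(1.0, 1.0))  # (0, 0, 0): both values high gives black
print(to_rgb(1.0, 0.0))  # green is reduced relative to red and blue
print(to_rgb(0.0, 1.0))  # blue is reduced relative to red and green
```

Points near the origin map to light colours and points near (1, 1) to dark ones, while the angle $\phi$ tints the colour depending on which of the two values dominates.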

# IV. CASE STUDY 1

The first experiment, named probsys, is based on a system consisting of a single Advanced/ARM eXtensible Interface (AXI) [4] master communicating with a single AXI slave. To give some familiar context, three additional binary states are measured on the slave component (busy, stall, and idle). A simulation is run to produce a Value Change Dump (VCD) file containing measurement data. The rates at which transactions are made on the five AXI channels (AW, W, B, AR, R) are controlled with a probability distribution via fixed inputs. On the source side, a stall means dropping \*READY, and on the destination side it means dropping \*VALID. Observing the measurements in a waveform viewer in order to validate system behaviour is not trivial due to the density and format of information. Some expected behaviours may be expressed as formal properties and proven. For example, assertions may be used to check that busy and idle are never high at the same time. However, listing and forming all of these properties is time consuming and gives no hints as to what behaviours a user might have forgotten to specify.
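To make the contrast with assertion-based checking concrete, the sketch below expresses the busy/idle property as a check over binarized traces. The traces and the function name are hypothetical and are not taken from probsys.

```python
def never_both_high(a, b):
    """True if binary traces a and b are never 1 in the same cycle."""
    return all(not (x and y) for x, y in zip(a, b))


# Hypothetical per-cycle samples of three slave status signals.
busy = [0, 1, 1, 0, 0, 1]
idle = [1, 0, 0, 1, 1, 0]
stall = [0, 0, 1, 0, 0, 0]

print(never_both_high(busy, idle))   # True: the property holds
print(never_both_high(busy, stall))  # False: both are high in cycle 2
```

Each such property has to be thought of and written by hand, one signal pair at a time, which is the effort the visualization aims to reduce.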

This visualization can be inspected interactively and intuitively to find more information at deeper levels such as the precise conditional probability, dependency, and covariance of each edge. Even at the static and low resolution of printed paper, much useful information is immediately available, with its significance denoted by the darkness of the ink. The edges correctly identify the measurements which are expected to be related to each other, e.g. AXI read replies are strongly related to AXI read requests. It can be quickly seen that the read channels are not interacting with the write channels, validating (or invalidating) expectations. This gives the system designer some confidence that the read and write parts of the system have not been accidentally linked through some unexpected mechanism. This example is analogous to the many use cases where it is desired to visualize the cooperation between channels, non-cooperation between channels, or interaction with a side-channel like busy.

(a) Plot showing measurement relationships from probsys. Interactions are clearly visible from the more densely connected nodes. Clusters of relations are outlined in red. Top-right: busy and idle are strongly related. Middle: AXI read request (AR) is strongly related to read reply (R) and read-decode error. Bottom-right: AXI write request (AW) is strongly related to write reply (B) and write-decode error. Read measurements are related to write measurements via busy.

(b) Closeup view of the rightmost measurement node from the same plot. Each edge is drawn as a line from a black dot just outside one of the quad's corners to the centre of another quad corner. The location of the dot depends on the value of  $\delta$ . The size of the dots, and the weight and thickness of the drawn edges, are further visual indicators of the Dep and Cov values.

Fig. 3

# V. CASE STUDY 2

A system which is more representative of the complex SoCs used today is required to demonstrate the method in a real-world application. The application chosen for this experiment is a software-based NN performing handwritten-digit recognition, running on a standard prototyping FPGA SoC [5] with two RISC-V processors and communicating with the outside world over USB. The C software, consisting of 17 functions running on 2 processors, is based on Tinn [6] [7] and modified such that the smaller processor, ACPU, collects batches and passes them to SCPU, the processor which has the full floating point unit.

Ease of instrumentation is essential for real-world systems as non-standard modifications can alter observed behaviour in subtle but important ways. The only modifications required are an additional GCC flag in the compile stage (-finstrument-functions), and optionally disabling instrumentation on uninteresting functions [8]. Network diagrams were generated using the methodology described above, two of which are shown in Fig. 4. Since each of these diagrams corresponds to a single time window it is natural to view them in sequence like a movie. Using the Scalable Vector Graphics (SVG) image format also allows the diagrams to be browsed and examined interactively to extract more detail and exact values as desired. Nodes with darker colours in the centre bands indicate more time spent in those functions. For example, SCPU spends most of its time in the train function and ACPU spends most of its time waiting during the training phase in Fig. 4a.

# VI. CONCLUSION

Although the experiment in Section IV is a proof-of-concept example, the case of monitoring transactions on multiple communication channels to ensure they are (or are not) cooperating is not uncommon, and traditional tools are often unable to provide useful visualizations for it. Similarly, the experiment in Section V represents a realistic use case where an engineer is able to quickly, and with minimal manual effort, gain useful knowledge about the workings of the software without looking at the source code. These case studies and the example results shown in the figures demonstrate that the methodology may be applied to a wide variety of situations in order to aid the understanding and validation of complex systems. Further work is being done with additional case studies which combine hardware probes such as those in the probsys experiment with software instrumentation such as that in the tinn experiment.

#### References

- [1] S. Lagraa, *New MP-SoC profiling tools based on data mining techniques*. PhD thesis, L'Université de Grenoble, 2014.
- [2] D. Lo, S.-C. Khoo, and C. Liu, "Mining past-time temporal rules from execution traces," ACM Workshop On Dynamic Analysis, pp. 50–56, Jul 2008.
- [3] M. Ivanovic and V. Kurbalija, "Time series analysis and possible applications," 39th International Convention on Information and Communication Technology, Electronics and Microelectronics, pp. 473–479, 2016.
- [4] ARM Limited, AMBA AXI and ACE Protocol Specification, 2011.
- [5] UltraSoC Technologies Ltd, UL-00231-TC-F-Taygete Prototype, Dec 2018.
- [6] G. Louw, "Tinn the tiny neural network library." https://github.com/glouw/tinn.
- [7] Semeion Research Center of Sciences of Communication and Tattile, "Semeion handwritten digit data set." https://archive.ics.uci.edu/ml/datasets/semeion+handwritten+digit.
- [8] UltraSoC Technologies Ltd, UL-001174-TR-3B-Static Instrumentation User Guide, Dec 2018.



(a) Behaviour graph of tinn during training phase.



(b) Behaviour graph of tinn during inference phase.

Fig. 4: The difference in behaviour graphs is clear, allowing a viewer to get a fast and useful overview of the main component interactions and how they change over time. The pattern of edges stays fairly constant in each phase, with the transition marked by many edges fading in and out as the 'training' pattern morphs into the 'inference' pattern. Immediately obvious is that most nodes are unconnected, and closer inspection of the connected nodes reveals that the connections are as one would expect from an NN application.