Abstract
Cell BE is a heterogeneous multicore processor designed for efficient execution of parallel and vectorizable applications with high computation and memory requirements. The transition to multicore processors introduces the challenge of providing tools that help programmers tune code running on these architectures. Tracing tools, in particular, often help locate performance problems related to thread and process communication.
A major impediment to implementing tracing on Cell is the absence of a common clock that can be accessed at low cost from all cores. The OS clock is costly to access from the auxiliary cores, and the hardware timers cannot be set simultaneously on all the cores. In this paper, we describe an offline trace analysis that assigns wall-clock time to trace records based on their thread-local time stamps and event order. Our experiments on several Cell SDK workloads show that the indeterminism in assigning the wall-clock time is low, on average 20–40 clock ticks (1.4–2.8 μs for a 14.8 MHz clock). We also show how various practical problems, such as the imprecision of time measurement, can be overcome.
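The idea of combining thread-local time stamps with event order can be illustrated with a minimal sketch. It is not the analysis described in the paper; it only assumes two threads A and B whose local clocks tick at the same rate but start at unknown offsets, and a set of matched events whose order is known (for example, a send on one thread that must precede the matching receive on the other). The function names `offset_bounds` and `ticks_to_us`, and the sample time stamps, are hypothetical.

```python
def offset_bounds(a_to_b, b_to_a):
    """Bound d = offset_B - offset_A, in ticks, from ordered event pairs.

    a_to_b: (t_a, t_b) pairs where the event at local time t_a on thread A
            is known to precede the event at local time t_b on thread B.
    b_to_a: (t_b, t_a) pairs where the event on B precedes the event on A.
    Assumes wall-clock time = local time + per-thread offset (no drift).
    """
    # t_a + offset_A <= t_b + offset_B  implies  d >= t_a - t_b
    lo = max(t_a - t_b for t_a, t_b in a_to_b)
    # t_b + offset_B <= t_a + offset_A  implies  d <= t_a - t_b
    hi = min(t_a - t_b for t_b, t_a in b_to_a)
    if lo > hi:
        # Can happen when time stamps are imprecise or the order is wrong.
        raise ValueError("event order is inconsistent with the time stamps")
    return lo, hi


def ticks_to_us(ticks, clock_hz=14.8e6):
    """Convert clock ticks to microseconds for a clock of the given frequency."""
    return ticks / clock_hz * 1e6


# Hypothetical local time stamps, in ticks.
a_to_b = [(100, 1130), (400, 1420)]
b_to_a = [(1200, 210), (1500, 505)]
lo, hi = offset_bounds(a_to_b, b_to_a)
width = hi - lo  # remaining indeterminism in the relative offset
print(lo, hi, width, round(ticks_to_us(width), 2))
```

Any relative offset inside the interval is consistent with the observed event order, so the interval width is one way to think about the indeterminism the abstract quantifies in ticks and microseconds.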
© 2008 Springer-Verlag Berlin Heidelberg
Cite this paper
Biberstein, M., Harel, Y., Heilper, A. (2008). Clock Synchronization in Cell BE Traces. In: Luque, E., Margalef, T., Benítez, D. (eds) Euro-Par 2008 – Parallel Processing. Euro-Par 2008. Lecture Notes in Computer Science, vol 5168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85451-7_2
DOI: https://doi.org/10.1007/978-3-540-85451-7_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85450-0
Online ISBN: 978-3-540-85451-7
eBook Packages: Computer Science, Computer Science (R0)