Abstract
To harness abundant hardware resources, parallel programming has become a necessity in the multicore era. However, parallel programs are prone to concurrency bugs, especially data races. Worse, current software tools suffer from both large runtime overhead and poor scalability, while most hardware support for race detection is not available for parallel programming. Building a practical and fast race detection tool therefore remains a challenge. Meanwhile, GPUs with massively parallel computation resources have become one of the most popular hardware platforms, and their prevalence opens an opportunity to accelerate data race detection.
In this paper, we first analyze data race detection algorithms such as happens-before in depth and observe that these algorithms exhibit very good computation and data parallelism. Based on this observation, we propose Grace, a software approach that leverages the massively parallel computation units of GPU architectures to accelerate data race detection. Grace deploys detection, the most computation-intensive workload, on the GPU to fully utilize its computation resources. Moreover, Grace exploits the computation resources of multi-core CPUs through coarse-grained pipeline parallelism and data parallelism to further improve performance. Experimental results show that Grace is fast and scalable: it achieves over 80x speedup compared to the sequential version even under a 128-thread configuration.
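To illustrate why happens-before detection is data parallel, the following is a minimal sketch of a vector-clock check in Python. It is not Grace's implementation: the `Access` record, the function names, and the example trace are all assumptions made for illustration. The point it shows is that each pair of accesses is checked independently of every other pair, so in principle each check could be assigned to its own GPU thread.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class Access:
    """One recorded memory access (hypothetical trace format)."""
    thread: int
    addr: int
    is_write: bool
    clock: Tuple[int, ...]  # vector clock of the thread at the access


def happens_before(a: Tuple[int, ...], b: Tuple[int, ...]) -> bool:
    """True if clock a is ordered strictly before clock b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def is_race(x: Access, y: Access) -> bool:
    """Two accesses race if they come from different threads, touch the
    same address, at least one is a write, and neither is ordered
    before the other by happens-before."""
    if x.thread == y.thread or x.addr != y.addr:
        return False
    if not (x.is_write or y.is_write):
        return False
    return (not happens_before(x.clock, y.clock)
            and not happens_before(y.clock, x.clock))


def find_races(trace: List[Access]) -> List[Tuple[int, int]]:
    """Check every pair of accesses; each check is independent of the
    others, which is the data parallelism a GPU could exploit."""
    return [(i, j)
            for (i, x), (j, y) in combinations(enumerate(trace), 2)
            if is_race(x, y)]


# Assumed example: two threads, two shared addresses.
trace = [
    Access(0, 0x10, True, (1, 0)),   # T0 writes 0x10
    Access(1, 0x10, True, (0, 1)),   # T1 writes 0x10, unordered with T0
    Access(0, 0x20, True, (2, 0)),   # T0 writes 0x20, then releases a lock
    Access(1, 0x20, False, (2, 2)),  # T1 reads 0x20 after acquiring it
]
races = find_races(trace)
```

In this trace the two writes to `0x10` carry incomparable clocks and are reported as a race, while the accesses to `0x20` are ordered by the synchronization and are not. The pairwise loop is quadratic and only meant to expose the independent checks; a practical detector would index accesses by address first.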
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dai, Z., Zhang, Z., Wang, H., Li, Y., Zhang, W. (2014). Parallelized Race Detection Based on GPU Architecture. In: Wu, J., Chen, H., Wang, X. (eds) Advanced Computer Architecture. Communications in Computer and Information Science, vol 451. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44491-7_9
DOI: https://doi.org/10.1007/978-3-662-44491-7_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44490-0
Online ISBN: 978-3-662-44491-7
eBook Packages: Computer Science, Computer Science (R0)