Abstract
In this paper, we report on our effort to design a fast, concurrent-safe hash table and implement it in C++, correctly. It is especially the latter that is the focus of this paper: concurrent data structures are notoriously hard to implement, and C++ is not known to be a particularly safe language. It does, however, offer unparalleled performance for the level of programming comfort it provides, especially in our area of interest – parallel workloads with intense interaction.
For these reasons, we have enlisted the help of a software model checker (DIVINE) with the ability to directly check the C++ implementation. We discuss how such a heavyweight tool integrated with the engineering effort, what the current limits of this approach are, and what kinds of assurances we obtained. Of course, we have applied the standard array of tools throughout the effort – unit testing, an interactive debugger, a memory error checker (valgrind) – in addition to the model checker, which puts us in an excellent position to weigh them against each other and point out where they complement each other.
This work has been partially supported by the Czech Science Foundation grant No. 18-02177S.
Notes
1. In particular, they provide nicer iterator and reference invalidation semantics and are less susceptible to pathological behaviour when using sub-optimal user-supplied hash functions.
2. Here, concurrency means that multiple CPU cores perform operations on the same data structure without additional synchronization.
3. Each cell can hold at most a single key. Basic operations on a cell include storing a key and comparing the content of the cell with a key.
4. In comparison, the design from [2] uses 3 machine words per thread, 64 + 32 words of fixed overhead per hash table instance and 11 words per cell vector. It also incurs 3 indirections compared to the single indirection in the current design.
5. The class naming has changed during later development stages. We use the last iteration of the class names throughout the paper to avoid confusing the reader.
6. Clearly, the keys stored in a contiguous segment of the smaller hash_table may end up distributed arbitrarily in the entire range of the bigger (target) instance.
7. Cells are invalidated when they are rehashed, so that a concurrent insertion will not accidentally use an empty cell that has already been ‘moved’ to the next generation, which would lead to a loss of the inserted key.
8. Unfortunately, model checking of scenarios with more than 2 threads does not seem to be realistic, at least not with the allocated budget of 100 GiB of RAM.
9. This incident also revealed a weakness in our testing methodology, where only ‘good’ hash functions were used during unit testing and model checking – cf. Sect. 4.1.
10. Deadlocks involving pthread mutexes are detected, but these mutexes are too expensive to be used in concurrent data structures.
11. A pointer which can only point at objects which manage their own reference counter.
12. The implementation was tested, but not model checked, and is not part of the hash table implementation (it was done in application-level code in an application-specific manner).
13. All the scenarios used a hash_set with the growth pattern 2-4-8. The last scenario was, however, also attempted with the pattern 1-2-4, reducing the size of the state space and memory requirements considerably, from 85 GiB to about 15 GiB.
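The cell behaviour described in notes 3 and 7 can be illustrated with a minimal sketch. This is not the paper's implementation: the `cell` type, its member names, the restriction to 64-bit integer keys and the two reserved marker values are all assumptions made for illustration. It only shows why invalidating a cell during rehashing prevents a concurrent insertion from claiming a slot that has already been moved.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical cell for 64-bit keys; two key values are reserved as markers
// (an assumption of this sketch, not something the paper specifies).
struct cell
{
    static constexpr std::uint64_t empty   = 0;                   // no key stored yet
    static constexpr std::uint64_t invalid = ~std::uint64_t( 0 ); // already rehashed

    std::atomic< std::uint64_t > value{ empty };

    // Store a key, but only if the cell is still empty. The compare-and-swap
    // fails for occupied cells and for cells that were invalidated.
    bool store( std::uint64_t key )
    {
        std::uint64_t expected = empty;
        return value.compare_exchange_strong( expected, key );
    }

    // Compare the content of the cell with a key.
    bool match( std::uint64_t key ) const
    {
        return value.load() == key;
    }

    // Used while rehashing: atomically mark the cell as moved and return
    // what it held, so the caller can re-insert that key elsewhere.
    std::uint64_t invalidate()
    {
        return value.exchange( invalid );
    }
};

// Usage sketch: a cell rehashed while still empty rejects later insertions.
inline void demo()
{
    cell a;
    assert( a.store( 42 ) );   // first store claims the empty cell
    assert( a.match( 42 ) );
    assert( !a.store( 7 ) );   // occupied cells cannot be overwritten

    cell b;
    assert( b.invalidate() == cell::empty ); // nothing to move
    assert( !b.store( 7 ) );   // the insertion must retry in the next generation
}
```

In this sketch, a failed `store` on an invalidated cell is the signal that the inserting thread has to repeat the operation in the newer generation, which is how the key loss mentioned in note 7 is avoided.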
References
Baranová, Z., et al.: Model checking of C and C++ with DIVINE 4 (2017)
Barnat, J., Ročkai, P., Štill, V., Weiser, J.: Fast, dynamically-sized concurrent hash table. In: Fischer, B., Geldenhuys, J. (eds.) SPIN 2015. LNCS, vol. 9232, pp. 49–65. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-23404-5_5
Chen, Z., Gu, Y., Huang, Z., Zheng, J., Liu, C., Liu, Z.: Model checking aircraft controller software: a case study. Softw. Pract. Exp. 45(7), 989–1017 (2015). https://doi.org/10.1002/spe.2242
Fitzgerald, J., Bicarregui, J., Larsen, P.G., Woodcock, J.: Industrial Deployment of Formal Methods: Trends and Challenges. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-33170-1. ISBN 978-3-642-33170-1
Gan, X., Dubrovin, J., Heljanko, K.: A symbolic model checking approach to verifying satellite onboard software. Sci. Comput. Program. 82, 44–55 (2014). https://doi.org/10.1016/j.scico.2013.03.005
Hwong, Y.L., Keiren, J.J., Kusters, V.J., Leemans, S., Willemse, T.A.: Formalising and analysing the control software of the CMS experiment at the large hadron collider. Sci. Comput. Program. 78(12), 2435–2452 (2013). https://doi.org/10.1016/j.scico.2012.11.009
Klein, G., et al.: seL4: formal verification of an OS kernel. In: SOSP, pp. 207–220. ACM (2009). https://doi.org/10.1145/1629575.1629596
Lång, J., Prasetya, I.S.W.B.: Model checking a C++ software framework, a case study (2019). https://doi.org/10.1145/3338906.3340453
O’Callahan, R., Jones, C., Froyd, N., Huey, K., Noll, A., Partush, N.: Engineering record and replay for deployability (2017). arXiv:1705.05937
Penix, J., Visser, W., Engstrom, E., Larson, A., Weininger, N.: Verification of time partitioning in the DEOS scheduler kernel. In: International Conference on Software Engineering, pp. 488–497. ACM Press (2000)
Potts, D., Bourquin, R., Andresen, L., Andronick, J., Klein, G., Heiser, G.: Mathematically verified software kernels: raising the bar for high assurance implementations. Technical report, NICTA, Sydney, Australia (2014)
Ročkai, P., Barnat, J.: A simulator for LLVM bitcode (2017). https://arxiv.org/abs/1704.05551. Preliminary version
Ročkai, P., Barnat, J., Brim, L.: Improved state space reductions for LTL model checking of C and C++ programs. In: Brat, G., Rungta, N., Venet, A. (eds.) NFM 2013. LNCS, vol. 7871, pp. 1–15. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38088-4_1
Stallman, R., Pesch, R., Shebs, S.: Debugging with GDB (2010)
Visan, A.-M., Arya, K., Cooperman, G., Denniston, T.: URDB: a universal reversible debugger based on decomposing debugging histories. In: PLOS 2011 (2011)
Woodcock, J., Larsen, P.G., Bicarregui, J., Fitzgerald, J.: Formal methods: practice and experience. ACM Comput. Surv. 41(4), 19:1–19:36 (2009). https://doi.org/10.1145/1592434.1592436
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Ročkai, P. (2020). Model Checking in a Development Workflow: A Study on a Concurrent C++ Hash Table. In: Sekerinski, E., et al. Formal Methods. FM 2019 International Workshops. FM 2019. Lecture Notes in Computer Science, vol 12232. Springer, Cham. https://doi.org/10.1007/978-3-030-54994-7_5
DOI: https://doi.org/10.1007/978-3-030-54994-7_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-54993-0
Online ISBN: 978-3-030-54994-7
eBook Packages: Computer Science (R0)