PTrie: Data Structure for Compressing and Storing Sets via Prefix Sharing

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10580)

Abstract

Sets and their efficient implementation are fundamental in all of computer science, including model checking, where sets are used as the basic data structure for storing (encodings of) states during a state-space exploration. In the quest for fast and memory efficient methods for manipulating large sets, we present a novel data structure called PTrie for storing sets of binary strings of arbitrary length. The PTrie data structure distinguishes itself by compressing the stored elements while sharing the desirable key characteristics with conventional hash-based implementations, namely fast insertion and lookup operations. We provide the theoretical foundation of PTries, prove the correctness of their operations and conduct empirical studies analysing the performance of PTries for dealing with randomly generated binary strings as well as for state-space exploration of a large collection of Petri net models from the 2016 edition of the Model Checking Contest (MCC’16). We experimentally document that with a modest overhead in running time, a truly significant space-reduction can be achieved. Lastly, we provide an efficient implementation of the PTrie data structure under the GPL version 3 license, so that the technology is made available for memory-intensive applications such as model-checking tools.
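To make the idea of prefix sharing concrete, the following is a minimal sketch of a plain binary trie that stores a set of bit strings of arbitrary length. It is not the PTrie described in the paper: it performs no compression beyond sharing common prefixes, and the names `BinaryTrieSet`, `insert` and `node_count` are hypothetical, chosen only for this illustration.

```python
def _bit(ch: str) -> int:
    """Map the characters '0' and '1' to child indices 0 and 1."""
    if ch not in ("0", "1"):
        raise ValueError(f"not a binary digit: {ch!r}")
    return int(ch)


class BinaryTrieSet:
    """Illustrative set of bit strings; strings with a common prefix share nodes."""

    def __init__(self):
        # Each node is a list: [child for bit 0, child for bit 1, is_member].
        self._root = [None, None, False]
        self._size = 0
        self._nodes = 1  # the root

    def insert(self, bits: str) -> bool:
        """Insert a bit string; return True if it was not already stored."""
        node = self._root
        for ch in bits:
            idx = _bit(ch)
            if node[idx] is None:
                # Allocate a node only where the new string leaves
                # the prefixes that are already stored.
                node[idx] = [None, None, False]
                self._nodes += 1
            node = node[idx]
        if node[2]:
            return False
        node[2] = True
        self._size += 1
        return True

    def __contains__(self, bits: str) -> bool:
        node = self._root
        for ch in bits:
            node = node[_bit(ch)]
            if node is None:
                return False
        return node[2]

    def __len__(self) -> int:
        return self._size

    @property
    def node_count(self) -> int:
        """Number of allocated nodes, a rough proxy for memory use."""
        return self._nodes
```

In this sketch, inserting `"0101"` and then `"0110"` allocates nodes for the shared prefix `"01"` only once, so the number of nodes grows with the number of distinct prefixes rather than with the total length of the stored strings.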

Notes

  1. Both of these extensions come with a smaller overhead in run-time and memory. Also, neither of these extensions currently supports Delete.

  2. Both available at https://github.com/sparsehash/sparsehash.

  3. Available at https://github.com/aappleby/smhasher/wiki/MurmurHash2.

  4. Available at https://code.launchpad.net/verifypn.

Acknowledgements

We acknowledge the support from Sino-Danish Basic Research Center IDEA4CPS, the Innovation Fund Denmark center DiCyPS, and the ERC Advanced Grant LASSO. The third author is partially affiliated with FI MU in Brno.

Author information

Corresponding author

Correspondence to Jiří Srba.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Jensen, P.G., Larsen, K.G., Srba, J. (2017). PTrie: Data Structure for Compressing and Storing Sets via Prefix Sharing. In: Hung, D., Kapur, D. (eds) Theoretical Aspects of Computing – ICTAC 2017. ICTAC 2017. Lecture Notes in Computer Science, vol 10580. Springer, Cham. https://doi.org/10.1007/978-3-319-67729-3_15

  • DOI: https://doi.org/10.1007/978-3-319-67729-3_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67728-6

  • Online ISBN: 978-3-319-67729-3

  • eBook Packages: Computer Science, Computer Science (R0)
