Block Graphs in Practice

Mathematics in Computer Science

Abstract

Motivated by the rapidly increasing size of genomic databases, code repositories and versioned texts, several compression schemes have been proposed that work well on highly repetitive strings and also support fast random access: e.g., LZ-End, RLZ, GDC, augmented SLPs, and block graphs. Block graphs have good worst-case bounds, but it has been an open question whether they are practical. We describe an implementation of block graphs that, for several standard datasets, provides better compression and faster random access than competing schemes.

Corresponding author

Correspondence to Travis Gagie.

Cite this article

Gagie, T., Hoobin, C. & Puglisi, S.J. Block Graphs in Practice. Math.Comput.Sci. 11, 191–196 (2017). https://doi.org/10.1007/s11786-016-0286-9

  • DOI: https://doi.org/10.1007/s11786-016-0286-9