An empirical study of delta algorithms

  • Versioning Models and Experiences
  • Conference paper
Software Configuration Management (SCM 1996)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1167)

Abstract

Delta algorithms compress data by encoding one file in terms of another. This type of compression is useful in a number of situations: storing multiple versions of data, distributing updates, storing backups, transmitting video sequences, and others. This paper studies the performance parameters of several delta algorithms, using a benchmark of over 1300 pairs of files taken from two successive releases of GNU software. Results indicate that modern delta compression algorithms based on Ziv-Lempel techniques significantly outperform diff, a popular but older delta compressor, in terms of compression ratio. The modern compressors also correlate better with the actual difference between files; one of them is even faster than diff in both compression and decompression speed.
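
As a rough illustration of what "encoding one file in terms of another" can mean, the sketch below expresses a target byte string as a list of copy instructions (ranges taken from the source) and add instructions (literal bytes absent from the source). It is not vdelta, diff, or any other algorithm measured in the paper; the names `make_delta`, `apply_delta`, and `min_match` are hypothetical, and the greedy matching is chosen for brevity rather than speed or compression ratio.

```python
def make_delta(source: bytes, target: bytes, min_match: int = 4):
    """Encode target as ("copy", offset, length) and ("add", data) instructions.

    Illustrative greedy scheme (an assumption, not the paper's method):
    every min_match-byte window of the source is indexed, and at each
    target position the longest match starting at an indexed offset wins.
    """
    index = {}
    for i in range(len(source) - min_match + 1):
        index.setdefault(source[i:i + min_match], []).append(i)

    delta = []
    literal = bytearray()  # bytes that could not be copied from the source
    pos = 0
    while pos < len(target):
        best_off, best_len = 0, 0
        for off in index.get(target[pos:pos + min_match], []):
            length = 0
            while (off + length < len(source)
                   and pos + length < len(target)
                   and source[off + length] == target[pos + length]):
                length += 1
            if length > best_len:
                best_off, best_len = off, length
        if best_len >= min_match:
            if literal:  # flush pending literal bytes before the copy
                delta.append(("add", bytes(literal)))
                literal.clear()
            delta.append(("copy", best_off, best_len))
            pos += best_len
        else:
            literal.append(target[pos])
            pos += 1
    if literal:
        delta.append(("add", bytes(literal)))
    return delta


def apply_delta(source: bytes, delta) -> bytes:
    """Rebuild the target from the source and a delta from make_delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            _, off, length = op
            out += source[off:off + length]
        else:
            out += op[1]
    return bytes(out)


if __name__ == "__main__":
    old = b"the quick brown fox jumps over the lazy dog\n"
    new = b"the quick red fox jumps over the lazy dog!\n"
    delta = make_delta(old, new)
    assert apply_delta(old, delta) == new
    print(delta)
```

When the two versions share long runs of bytes, most of the target is covered by a few copy instructions and only the changed bytes are stored literally, which is why delta encoding suits storing successive versions or distributing updates.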

References

  1. James A. Gosling. A redisplay algorithm. In Proc. of the ACM SIGPLAN/SIGOA Symposium on Text Manipulation, pages 123–129, 1981.

  2. James W. Hunt and M. D. McIlroy. An algorithm for differential file comparison. Computing Science Technical Report 41, Bell Laboratories, June 1976.

  3. James W. Hunt and Thomas G. Szymanski. A fast algorithm for computing longest common subsequences. Communications of the ACM, 20(5):350–353, May 1977.

  4. Douglas W. Jones. Application of splay trees to data compression. Communications of the ACM, 31(8):996–1007, August 1988.

  5. David G. Korn and Kiem-Phong Vo. Vdelta: Efficient data differencing and compression. In preparation, 1995.

  6. E. M. McCreight. A space-economical suffix tree construction algorithm. Journal of the ACM, 23(2):262–272, 1976.

  7. Webb Miller and Eugene W. Myers. A file comparison program. Software—Practice and Experience, 15(11):1025–1039, November 1985.

  8. Narao Nakatsu, Yahiko Kambayashi, and Shuzo Yajima. A longest common subsequence algorithm for similar text strings. Acta Informatica, 18:171–179, 1982.

  9. Wolfgang Obst. Delta technique and string-to-string correction. In Proc. of the First European Software Engineering Conference, pages 69–73. AFCET, Springer-Verlag, September 1987.

  10. Marc J. Rochkind. The source code control system. IEEE Transactions on Software Engineering, SE-1(4):364–370, December 1975.

  11. Walter F. Tichy. The string-to-string correction problem with block moves. ACM Transactions on Computer Systems, 2(4):309–321, November 1984.

  12. Walter F. Tichy. RCS — a system for version control. Software—Practice and Experience, 15(7):637–654, July 1985.

  13. Kiem-Phong Vo. A prefix matching algorithm suitable for data compression. In preparation, 1995.

  14. J. Ziv and A. Lempel. Compression of individual sequences via variable-rate coding. IEEE Trans. on Information Theory, IT-24(5):530–536, September 1978.

Editor information

Ian Sommerville

Copyright information

© 1996 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hunt, J.J., Vo, K.P., Tichy, W.F. (1996). An empirical study of delta algorithms. In: Sommerville, I. (eds) Software Configuration Management. SCM 1996. Lecture Notes in Computer Science, vol 1167. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0023080

  • DOI: https://doi.org/10.1007/BFb0023080

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-61964-2

  • Online ISBN: 978-3-540-49569-7

  • eBook Packages: Springer Book Archive
