Merging sorted runs using large main memory

Acta Informatica

Summary

External sorting is usually accomplished by first creating sorted runs and then merging them. In the merge phase, reading can be overlapped with writing and computation if two input buffers are used for each sorted run. If the memory is very large, the input buffers will also be large, and using two input buffers per sorted run is more efficient than using only one input buffer per run and risking reduced overlap of reading and writing. In many cases, merging time can be cut in half. We derive a formula for estimating the total merging time for a given memory size, file size, number of merge passes, and disk drive. We present an extreme example in which, in spite of having two buffers per run, significant non-overlap occurs. However, for realistic problems, we show that making one merge pass with two input buffers per run is near optimal. This contradicts earlier results on merging, which do not take large memory into account.
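The trade-off described in the summary can be illustrated with a rough back-of-the-envelope model. The sketch below is not the formula derived in the article. It assumes a simplified cost model in which every input buffer refill pays one fixed disk access (seek plus rotational delay) before transferring a full buffer, the output is written sequentially to a separate drive, and computation time is negligible. The function names and all numeric values are hypothetical.

```python
def read_time_s(file_bytes, memory_bytes, runs, buffers_per_run,
                access_s, bytes_per_s):
    """Seconds to read the whole file once, one buffer-sized block at a time.

    Assumed model: memory is split evenly among all input buffers, and each
    refill costs one disk access plus the transfer time of one buffer.
    """
    buffer_bytes = memory_bytes / (runs * buffers_per_run)
    refills = file_bytes / buffer_bytes
    return refills * access_s + file_bytes / bytes_per_s


def merge_time_s(file_bytes, memory_bytes, runs, buffers_per_run,
                 access_s, bytes_per_s):
    """Estimated time for one merge pass under the assumed model."""
    read_s = read_time_s(file_bytes, memory_bytes, runs, buffers_per_run,
                         access_s, bytes_per_s)
    # Assumption: output goes sequentially to a separate drive.
    write_s = file_bytes / bytes_per_s
    if buffers_per_run >= 2:
        # Double buffering: idealized full overlap of reading and writing.
        return max(read_s, write_s)
    # Single buffering: pessimistic case with no overlap at all.
    return read_s + write_s


# Hypothetical parameters: 1 GB file, 100 MB memory, 10 runs,
# 30 ms per disk access, 2 MB/s transfer rate.
params = dict(file_bytes=1e9, memory_bytes=1e8, runs=10,
              access_s=0.03, bytes_per_s=2e6)
one_buffer = merge_time_s(buffers_per_run=1, **params)
two_buffers = merge_time_s(buffers_per_run=2, **params)
print(f"one buffer per run:  {one_buffer:.0f} s")
print(f"two buffers per run: {two_buffers:.0f} s")
```

With these made-up numbers the model gives about 1003 s for one buffer per run and about 506 s for two. Halving the buffer size doubles the number of disk accesses, but when buffers are large the access overhead is small next to the transfer time, so the gain from overlap dominates. The two branches are deliberately the extreme cases (no overlap versus perfect overlap); the summary itself notes that significant non-overlap can still occur with two buffers per run.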


Cite this article

Salzberg, B. Merging sorted runs using large main memory. Acta Informatica 27, 195–215 (1989). https://doi.org/10.1007/BF00572988

  • DOI: https://doi.org/10.1007/BF00572988
