Optimality in External Memory Hashing

Algorithmica

Abstract

Hash tables on external memory are commonly used for indexing in database management systems. In this paper we present an algorithm that, in an asymptotic sense, achieves the best possible I/O and space complexities. Let B denote the number of records that fit in a block, and let N denote the total number of records. Our hash table uses \(1+O(1/\sqrt{B})\) I/Os, expected, for looking up a record (whether or not it is present). Inserting, deleting or changing a record that has just been looked up requires \(1+O(1/\sqrt{B})\) I/Os, amortized expected, including I/Os for reorganizing the hash table when the size of the database changes. The expected external space usage is \(1+O(1/\sqrt{B})\) times the optimum of N/B blocks, and just O(1) blocks of internal memory are needed.
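To make the two quantities in the abstract concrete, the following is a minimal simulation sketch of a plain external hash table with overflow chaining. It is not the algorithm of the paper; it is a textbook baseline, and the function name `simulate_chained_lookup_cost`, the `load_factor` parameter and the idealized uniform random hash are all assumptions made for illustration. It reports the average number of block reads per successful lookup and the space used relative to the optimum of N/B blocks, the two measures that the paper bounds simultaneously by \(1+O(1/\sqrt{B})\).

```python
import random


def simulate_chained_lookup_cost(num_records, block_capacity, load_factor=0.5, seed=0):
    """Simulate a plain chained external hash table (NOT the paper's scheme).

    Returns (avg_reads, space_ratio):
      avg_reads   - average block reads per successful lookup
      space_ratio - blocks used divided by the optimum num_records / block_capacity
    Assumes num_records > 0 and an idealized uniform random hash function.
    """
    rng = random.Random(seed)
    # Choose the number of primary blocks so that each is filled to
    # load_factor * block_capacity records on average.
    num_buckets = max(1, round(num_records / (load_factor * block_capacity)))

    # Throw every record into a uniformly random bucket.
    counts = [0] * num_buckets
    for _ in range(num_records):
        counts[rng.randrange(num_buckets)] += 1

    total_reads = 0
    total_blocks = 0
    for c in counts:
        full, rest = divmod(c, block_capacity)
        # Every bucket owns a primary block, plus overflow blocks if needed.
        total_blocks += max(1, full + (1 if rest else 0))
        # A record stored in the k-th block of its chain (k = 1, 2, ...)
        # costs k reads: block_capacity records at each full level,
        # then `rest` records in the last, partially filled block.
        total_reads += block_capacity * full * (full + 1) // 2 + rest * (full + 1)

    avg_reads = total_reads / num_records
    space_ratio = total_blocks / (num_records / block_capacity)
    return avg_reads, space_ratio


if __name__ == "__main__":
    for alpha in (0.5, 0.8, 1.0):
        reads, space = simulate_chained_lookup_cost(100_000, 64, alpha)
        print(f"load factor {alpha}: {reads:.4f} reads/lookup, {space:.3f}x optimal space")
```

In this baseline, lowering the load factor pushes the lookup cost toward one block read but inflates the space, while a load factor near 1 does the opposite. The result stated in the abstract is that both overheads can be held to \(O(1/\sqrt{B})\) at once.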



Author information

Corresponding author

Correspondence to Rasmus Pagh.


About this article

Cite this article

Jensen, M.S., Pagh, R. Optimality in External Memory Hashing. Algorithmica 52, 403–411 (2008). https://doi.org/10.1007/s00453-007-9155-x


  • DOI: https://doi.org/10.1007/s00453-007-9155-x
