Structures for Large Data Sets

Encyclopedia of Big Data Technologies

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this entry

Cite this entry

Jin, P. (2018). Structures for Large Data Sets. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_168-1

  • DOI: https://doi.org/10.1007/978-3-319-63962-8_168-1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63962-8

  • Online ISBN: 978-3-319-63962-8

  • eBook Packages: Springer Reference Mathematics, Reference Module Computer Science and Engineering

Chapter history

  1. Latest

    Structures for Large Data Sets
    Published: 08 July 2022

    DOI: https://doi.org/10.1007/978-3-319-63962-8_168-2

  2. Original

    Structures for Large Data Sets
    Published: 07 June 2018

    DOI: https://doi.org/10.1007/978-3-319-63962-8_168-1