A Universal Distributed Indexing Scheme for Data Centers with Tree-Like Topologies

  • Conference paper
Database and Expert Systems Applications (Globe 2015, DEXA 2015)

Abstract

Indices in distributed storage systems manage the stored data and support diverse queries efficiently. A secondary index, that is, an index built on attributes other than the primary key, facilitates a variety of queries for different purposes. In this paper, we propose U2-Tree, a universal distributed secondary indexing scheme built on cloud storage systems with tree-like topologies. U2-Tree is composed of two layers: the global index and the local index. We first build the local index according to the local data features, and then assign each host a potential indexing range of the global index. After that, we use several techniques to publish the metadata about the local index to the global index host. The global index is then constructed from the collected intervals. We take advantage of the topological properties of tree-like topologies, and we introduce and compare detailed optimization techniques for constructing the two-layer indexing scheme. Furthermore, we discuss index updating, index tuning, and the fault tolerance of U2-Tree. Finally, we conduct numerical experiments to evaluate the performance of U2-Tree. The universal indexing scheme provides a general approach to secondary indexing in data centers with tree-like topologies. Moreover, many of the techniques and conclusions can be applied to other data center network (DCN) topologies.
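The two-layer idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `LocalIndex` and `GlobalIndex`, the choice of a single (min, max) interval per host, and the linear scan over published intervals are all assumptions made for illustration, and the sketch ignores the tree-like topology, index tuning, and fault tolerance that the paper discusses.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field


@dataclass
class LocalIndex:
    """Hypothetical per-host index: sorted (value, primary_key) pairs."""
    host: str
    entries: list = field(default_factory=list)

    def insert(self, value, primary_key):
        # Keep entries ordered by the secondary attribute value.
        pos = bisect_left(self.entries, (value, primary_key))
        self.entries.insert(pos, (value, primary_key))

    def interval(self):
        # Metadata published to the global layer (assumed here to be
        # one interval covering all locally indexed values).
        if not self.entries:
            return None
        return (self.entries[0][0], self.entries[-1][0])

    def range_query(self, low, high):
        values = [value for value, _ in self.entries]
        start = bisect_left(values, low)
        stop = bisect_right(values, high)
        return [pk for _, pk in self.entries[start:stop]]


class GlobalIndex:
    """Hypothetical global layer built from the collected intervals."""

    def __init__(self):
        self.intervals = []  # list of (low, high, LocalIndex)

    def publish(self, local):
        iv = local.interval()
        if iv is not None:
            self.intervals.append((iv[0], iv[1], local))

    def range_query(self, low, high):
        results = []
        # A linear scan stands in for a real interval structure.
        for iv_low, iv_high, local in self.intervals:
            if iv_low <= high and low <= iv_high:  # intervals overlap
                results.extend(local.range_query(low, high))
        return sorted(results)


if __name__ == "__main__":
    h1, h2 = LocalIndex("host-1"), LocalIndex("host-2")
    for value, pk in [(5, "a"), (12, "b"), (20, "c")]:
        h1.insert(value, pk)
    for value, pk in [(18, "d"), (30, "e"), (41, "f")]:
        h2.insert(value, pk)
    g = GlobalIndex()
    g.publish(h1)
    g.publish(h2)
    print(g.range_query(10, 25))
```

In this sketch a range query first consults the global layer to find hosts whose published interval overlaps the query range, and only those hosts search their local index. A production design would replace the linear scan with an interval-oriented structure and would republish intervals when local data changes.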

This work has been supported in part by the National Natural Science Foundation of China (Grant Nos. 61202024, 61472252, 61133006, 61422208), the China 973 project (2014CB340303), the Natural Science Foundation of Shanghai (Grant No. 12ZR1445000), the Shanghai Educational Development Foundation (Chenguang Grant No. 12CG09), the Shanghai Pujiang Program (13PJ1403900), and in part by the Jiangsu Future Network Research Project (No. BY2013095-1-10) and the CCF-Tencent Open Fund.

Author information

Corresponding author

Correspondence to Xiaofeng Gao.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Liu, Y., Gao, X., Chen, G. (2015). A Universal Distributed Indexing Scheme for Data Centers with Tree-Like Topologies. In: Chen, Q., Hameurlain, A., Toumani, F., Wagner, R., Decker, H. (eds) Database and Expert Systems Applications. Globe 2015, DEXA 2015. Lecture Notes in Computer Science, vol 9261. Springer, Cham. https://doi.org/10.1007/978-3-319-22849-5_33

  • DOI: https://doi.org/10.1007/978-3-319-22849-5_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-22848-8

  • Online ISBN: 978-3-319-22849-5

  • eBook Packages: Computer Science, Computer Science (R0)