HYPAD: Hyper-Graph-Driven Approach for Parallel Data Warehouse Design

  • Conference paper
Algorithms and Architectures for Parallel Processing (ICA3PP 2015)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9531)

Abstract

Small, medium and large companies all face three well-identified problems: (i) the data deluge, (ii) the large number of interacting exploratory queries and (iii) the economic crisis. It has therefore become necessary to address these problems and to develop low-cost database deployment solutions. Data-parallel architectures are among the deployment platforms that can manage this deluge of data efficiently. The process of designing such an architecture has to take into account the interaction that may exist between queries. Although the state of the art on parallel data warehouses is quite rich, to the best of our knowledge query interaction has not been highlighted. This is surprising, since queries are at the core of parallel design, and ignoring their interaction may degrade the quality of the final design. In this paper, we propose a new scalable hyper-graph approach, called HYPAD, for designing cluster warehouses by considering concurrent, highly interacting analytical queries. Our approach is validated through a data warehouse cluster simulator. The obtained results show the effectiveness and efficiency of our proposal.

Notes

  1. http://glaros.dtc.umn.edu/gkhome/metis/hmetis/overview.

Author information

Corresponding author

Correspondence to Soumia Benkrid.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Boukorca, A., Bellatreche, L., Benkrid, S. (2015). HYPAD: Hyper-Graph-Driven Approach for Parallel Data Warehouse Design. In: Wang, G., Zomaya, A., Martinez, G., Li, K. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2015. Lecture Notes in Computer Science, vol 9531. Springer, Cham. https://doi.org/10.1007/978-3-319-27140-8_53

  • DOI: https://doi.org/10.1007/978-3-319-27140-8_53

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-27139-2

  • Online ISBN: 978-3-319-27140-8

  • eBook Packages: Computer Science (R0)
