ABSTRACT
This paper presents a case study of migrating the University of Brasilia's Data Warehouse from a ROLAP architecture to an OLAP architecture on column-family NoSQL databases. The migration is motivated by the need to study new architectural paradigms for decision support systems, given the problems that Big Data now poses at the university. We took two approaches: the first migrates to the Cassandra DBMS and the second to Apache Hive. We surveyed the state of the art using the Web of Science database and the TEMAC methodology. For both Cassandra and Hive, we transformed the schema by integrating all dimensions and the fact table into a single table, and we compared the two systems by running two queries against these tables. We also performed a qualitative analysis of the advantages and disadvantages of each approach. We concluded that Apache Hive should be the better choice for the University of Brasilia.
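The single-table transformation mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the tables, column names, and values below are hypothetical, and plain Python dictionaries stand in for Cassandra or Hive tables. The sketch only shows the general idea of folding each dimension row into its fact row, so that a query reads one wide table and needs no joins.

```python
# Hypothetical star schema; all names and values are illustrative assumptions.
dim_student = {
    1: {"student_name": "Ana", "student_course": "Physics"},
    2: {"student_name": "Bruno", "student_course": "History"},
}
dim_period = {
    10: {"period_year": 2018, "period_semester": 1},
    11: {"period_year": 2018, "period_semester": 2},
}
fact_enrollment = [
    {"student_id": 1, "period_id": 10, "credits": 20},
    {"student_id": 1, "period_id": 11, "credits": 16},
    {"student_id": 2, "period_id": 10, "credits": 24},
]


def flatten(facts, dimensions):
    """Merge every fact row with its dimension rows into one wide row."""
    wide_rows = []
    for fact in facts:
        row = dict(fact)  # copy so the original fact row is not modified
        for key_column, dim_table in dimensions.items():
            # Look up the dimension row by its foreign key and copy its attributes.
            row.update(dim_table[fact[key_column]])
        wide_rows.append(row)
    return wide_rows


def credits_by_course(rows):
    """Aggregate over the wide table alone; no join is needed at query time."""
    totals = {}
    for row in rows:
        course = row["student_course"]
        totals[course] = totals.get(course, 0) + row["credits"]
    return totals


wide_table = flatten(
    fact_enrollment,
    {"student_id": dim_student, "period_id": dim_period},
)
print(credits_by_course(wide_table))
```

The trade-off in this kind of design is that dimension attributes are repeated in every fact row, which costs storage but lets each query scan a single table.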
Index Terms
- ROLAP DW transformation proposal for OLAP architecture in NoSQL database