Abstract
Thanks to their spatially distributed sensors, cyber-physical system (CPS) applications now collect large amounts of heterogeneous data. To let several decision-makers plan their actions collaboratively, these applications need appropriate tools for the efficient storage, analysis, and visualization of the available data. Spatial data warehouses (SDWs) have proven efficient at these operations. However, as data volumes grow, the commonly used spatial extract-transform-load (SETL) process generally fails to update the SDW within acceptable timeframes. To solve this problem, we propose performing the SETL tasks in a distributed, parallel manner on a grid of computing resources. In addition to being the only solution that uses grid computing for the SETL process of SDWs, our solution uses cloud computing techniques to shorten spatial data processing time and reduce resource consumption. To meet these goals, we propose a multi-agent-based solution that schedules and balances the processing activities over the grid while allowing real-time and archive data to be used jointly for the personalized reporting and visualization services intended for the decision-makers who use the same CPS application.
Ethics declarations
Conflict of interest
The authors, Boubaker Boulekrouche, Nafaâ Jabeur, and Zaia Alimazighi, declare that there is no conflict of interest regarding the publication of this paper.
Additional information
This paper is an extended version of a paper which previously appeared in the Proceedings of the 12th International Conference on Mobile Systems and Pervasive Computing in 2015.
Cite this article
Boulekrouche, B., Jabeur, N. & Alimazighi, Z. Toward integrating grid and cloud-based concepts for an enhanced deployment of spatial data warehouses in cyber-physical system applications. J Ambient Intell Human Comput 7, 475–487 (2016). https://doi.org/10.1007/s12652-016-0376-1
DOI: https://doi.org/10.1007/s12652-016-0376-1