A New Approach Based on ELK Stack for the Analysis and Visualisation of Geo-referenced Sensor Data

  • Original Research
SN Computer Science

Abstract

This paper examines the use of Elasticsearch for the data warehousing and analysis of geo-referenced sensor data. Elasticsearch has several advantages over its direct competitors. For example, it can handle time series, spatial data, and objects. Moreover, it is natively connected with the data shippers Beats and Logstash and with the visualisation tool Kibana. This paper proposes a method to implement and query multidimensional models in Elasticsearch. No prior work has evaluated Elasticsearch for data warehouses and analytical queries, especially for environmental sensor data. This paper therefore also presents extensive experiments to evaluate its querying performance. The proposed approach is applied to the analysis of sensor data in the context of CEBA, an environmental cloud solution developed to collect, store, and analyse environmental data.

Data availability

The data that support the study are available in Google Drive at https://drive.google.com/drive/folders/1ATdzq_p-jwrhPLyWkrfE_O8nCE8LD6s4?usp=sharing.

Notes

  1. https://github.com/AnnaNgo13/es_etl.

References

  1. ConnecSenS P. 2015–2020. http://www.lpc-clermont.in2p3.fr/spip.php?article583. Retrieved June 2021.

  2. Terray LA-J. From sensor to cloud: an IoT network of radon outdoor probes to monitor active volcanoes. Sensors. 2020; pp. 2755 (Multidisciplinary Digital Publishing Institute).

  3. Bajer M. Building an IoT data hub with Elasticsearch, Logstash and Kibana. In: 5th international conference on future internet of things and cloud workshops (FiCloudW). IEEE. 2017. pp. 63–8.

  4. Inmon WH. Building the data warehouse. New York: Wiley; 2005.

  5. Jarke MA. Fundamentals of data warehouses. New York: Springer; 2002.

  6. Pinet FA. Precise design of environmental data warehouses. Oper Res. 2010; vol. 10. pp. 349–369.

  7. Bicevska ZA. Towards NoSQL-based data warehouse solutions. Procedia Comput Sci. 2017; vol. 104. pp. 104–111.

  8. Lenzerini M. Data integration: a theoretical perspective. In: Proceedings of the twenty-first ACM SIGMOD-SIGACT-SIGART symposium on principles of database systems. 2002. pp. 233–46.

  9. Sabtu AA. The challenges of extract, transform and loading (ETL) system implementation for near real-time environment. In: 2017 international conference on research and innovation in information systems (ICRIIS). IEEE. 2017.

  10. Pilato D. How to fetch data from multiple index using join like sql. Retrieved from Elasticsearch. 2017. https://discuss.elastic.co/t/how-to-fetch-data-from-multiple-index-using-join-like-sql/106131. Retrieved June 2021.

  11. Bansal SK. Integrating big data: A semantic extract-transform-load framework. In: Computer. IEEE. 2015. pp. 42–50.

  12. Elasticsearch. ELK. 2020. https://www.elastic.co/elastic-stack. Retrieved June 2021.

  13. Guo DA. State-of-the-art geospatial information processing in NoSQL databases. ISPRS Int J Geo-inf. 2020; pp. 331 (Multidisciplinary Digital Publishing Institute).

  14. Dubey SA. Data visualization on GitHub repository parameters using Elastic search and Kibana. In: 2018 2nd international conference on trends in electronics and informatics (ICOEI). IEEE. 2018. pp. 554–8.

  15. Nipun Garg SM. Spatial databases spatial data warehouses. Retrieved from pdfs.semanticscholar.org. 2011. https://pdfs.semanticscholar.org/684a/4a2c41360e5965281ee09cabbb621f4400cb.pdf. Retrieved June 2021.

  16. Matei AA-M. OLAP for multidimensional semantic web databases. Enabl Real Time Bus Intell. 2014;81–96.

  17. Wrembel R. Data warehouses and OLAP: concepts, architectures and solutions. IGI Global. 2006.

  18. Albrecht AA. Managing ETL processes. NTII. 2008;8:12–5.

  19. CEBA project. 2020–2025. https://mesocentre.uca.fr/projets-associes/ceba. Retrieved June 2021.

  20. Werneck GL. Georeferenced data in epidemiologic research. Ciência & Saúde Coletiva. 2008;13:1753–66.

  21. Alam MM. A survey on spatio-temporal data analytics systems. 2021. arXiv:2103.09883.

  22. Hintze PA. Geographically referenced data for social science. RatSWD_WP_. 2009.

  23. Lee J-GA. Geospatial big data: challenges and opportunities. Big Data Res. 2015;2:74–81.

  24. Kulsawasd Jitkajornwanich NP. A survey on spatial, temporal, and spatio-temporal database research and an original example of relevant applications using SQL ecosystem and deep learning. J Inf Telecommun. 2020;4(4):524–59.

  25. Elasticsearch. Scalability and resilience: clusters, nodes, and shards. Retrieved from Elasticsearch. 2021. https://www.elastic.co/guide/en/elasticsearch/reference/current/scalability.html. Retrieved June 2021.

  26. Tewtia HK. COVID-19 insightful data visualization and forecasting using elasticsearch. In: Computational intelligence methods in COVID-19: surveillance, prevention, prediction and diagnosis. Springer. 2021. pp. 191–205.

  27. CEBA. CAHIER DES CHARGES BASE DE DONNEES. 2018. http://doc.ceba.uca.fr. Retrieved June 2021.

  28. Elasticsearch. Creating a visualization. 2021. https://www.elastic.co/guide/en/kibana/6.8/createvis.html. Retrieved June 2021.

  29. Bedard YA. Fundamentals of spatial data warehousing for geographic knowledge discovery. Geogr Data Min Knowl Discov. 2001;2:53–73.

  30. Barnsteiner F. Elasticsearch as a time series data store. 2015. https://www.elastic.co/blog/elasticsearch-as-a-time-series-data-store. Retrieved June 2021.

  31. Ngo TT-A. An analytical tool for georeferenced sensor data based on ELK stack. In: Proceedings of the 7th international conference on geographical information systems theory, applications and management (GISTAM 2021). SCITEPRESS—Science and Technology Publications, Lda. 2021. pp. 82–89.

  32. Kramer M. GeoRocket: a scalable and cloud-based data store for big geospatial files. SoftwareX. Elsevier. 2020. p. 100409.

  33. Bartlett R. Local geographic information storing and querying using elasticsearch. In: Proceedings of the 13th workshop on geographic information retrieval. pp. 1–4. 2019.

  34. Quoc HN. An elastic and scalable spatiotemporal query processing for linked sensor data. In: Proceedings of the 11th international conference on semantic systems. pp. 17–24. 2015.

  35. Dobson SA. A reference architecture and model for sensor data warehousing. IEEE Sens J. 2018;18:7659–70 (IEEE).

  36. PostGIS. Chapter 15. PostGIS Special Functions Index. 2022. https://postgis.net/docs/PostGIS_Special_Functions_Index.html. Retrieved Apr 2022.

  37. Agarwal SA. Performance analysis of MongoDB versus PostGIS/PostGreSQL databases for line intersection and point containment spatial queries. Spat Inf Res. 2016;24:671–7.

  38. Bartoszewski DA. The comparison of processing efficiency of spatial data for PostGIS and MongoDB databases. In: International conference: beyond databases, architectures and structures. Springer. 2019. pp. 291–302.

  39. Bimonte SA. When spatial analysis meets OLAP: multidimensional model and operators. Int J Data Warehous Min. 2010;6:33–60.

  40. Boulil KA. A UML & spatial OCL based approach for handling quality issues in SOLAP systems. In I. (1) (ed.). pp. 99–104. 2012.

  41. Boulil KA. Spatial OLAP integrity constraints: from UML-based specification to automatic implementation: application to energetic data in agriculture. J Decis Syst. 2014;23:460–80.

  42. Boulil KA-P. Guaranteeing the quality of multidimensional analysis in data warehouses of simulation results: application to pesticide transfer data produced by the MACRO model. Ecol Inform. 2013;16:41–52.

  43. Miralles AA. EIS pesticide: an information system for data and knowledge capitalization and analysis. In: Euraqua-peer scientific conference. 2011.

  44. Liang SA-Y. OGC SensorThings API part 1: sensing, version 1.0. Open geospatial consortium. 2016.

  45. ISO 19156:2011. From International Organization for Standardization, ISO 19156:2011, geographic information—observation & measurement. 2011. https://www.iso.org/standard/32574.html. Retrieved Mar 2022.

  46. ISO 19115-1:2014. From geographic information—metadata—part 1: fundamentals. 2014. https://www.iso.org/standard/53798.html. Retrieved Mar 2022.

  47. Geonetwork. 2022. https://geonetwork-opensource.org/. Retrieved Mar 2022.

Acknowledgements

This research was financed by the French government IDEX-ISITE initiative 16-IDEX-0001 (CAP 20-25) and the PhD was funded by the European Regional Development Fund (FEDER).

Author information

Corresponding author

Correspondence to Thi Thu Trang Ngo.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Geographical Information Systems Theory, Applications and Management” guest edited by Lemonia Ragia, Cédric Grueau and Robert Laurini.

Appendices

Appendix A

See Figs. 20, 21, 22 and 23.

Fig. 20 Benchmark dashboard: visualisation 1

Fig. 21 Benchmark dashboard: visualisation 2

Fig. 22 Benchmark dashboard: visualisation 3

Fig. 23 Benchmark dashboard: visualisation 4

Appendix B

The dataset can be visualised in many forms, e.g. bar charts and line graphs. This appendix explains how to create a visualisation in Kibana (Elasticsearch, Creating a Visualization, 2021).

  • Navigate to the visualisation page by clicking Visualize in the left panel of the Kibana home page.

  • Select a visualisation type, e.g. line, area, or maps.

  • Select the index that holds the dataset to be visualised.

A query panel for metric and bucket aggregations is then displayed by default, as shown in Fig. 24.

Fig. 24 Kibana: visualisation controller in default mode

In the visualisation controller, the metrics (Fig. 25) define the values plotted on the Y axis, and the buckets (Fig. 26) define how the data are grouped along the X axis. Take visualisation 3 (see Fig. 22) as an example, which monitors air humidity measurements by device and hour: the Y axis shows the air humidity value received from each sensor, and the device names and hours are displayed along the X axis.

Fig. 25 Kibana: visualisation metric controller

Fig. 26 Kibana: visualisation bucket controller
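
The metric and bucket settings described above correspond to an Elasticsearch aggregation request. As a minimal sketch, and not the authors' actual configuration, the Python snippet below builds such a request body for visualisation 3 (average air humidity per device and hour) and emulates the same grouping locally on a few made-up readings. The field names `device_name`, `timestamp`, and `air_humidity` and the sample values are assumptions for illustration only; the real CEBA index mapping is not given here. The `calendar_interval` key applies to Elasticsearch 7.2 and later; older releases such as 6.8 use `interval` instead.

```python
import json
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical field names; the real index mapping is not shown in the article.
DEVICE_FIELD = "device_name"
TIME_FIELD = "timestamp"
VALUE_FIELD = "air_humidity"


def build_humidity_query(interval_key="calendar_interval"):
    """Build an aggregation body: buckets by device and hour, metric = average."""
    return {
        "size": 0,  # return aggregation results only, no raw documents
        "aggs": {
            "by_device": {  # bucket: one group per device (X axis, split series)
                "terms": {"field": DEVICE_FIELD},
                "aggs": {
                    "by_hour": {  # bucket: one group per hour (X axis)
                        "date_histogram": {
                            "field": TIME_FIELD,
                            interval_key: "hour",
                        },
                        "aggs": {
                            # metric: value plotted on the Y axis
                            "avg_humidity": {"avg": {"field": VALUE_FIELD}}
                        },
                    }
                },
            }
        },
    }


def aggregate_locally(docs):
    """Emulate the same bucket + metric logic in plain Python (illustration only)."""
    buckets = defaultdict(list)
    for doc in docs:
        # Truncate the timestamp to the start of its hour to form the hour bucket.
        hour = datetime.fromisoformat(doc[TIME_FIELD]).replace(
            minute=0, second=0, microsecond=0
        )
        buckets[(doc[DEVICE_FIELD], hour.isoformat())].append(doc[VALUE_FIELD])
    return {key: mean(values) for key, values in buckets.items()}


if __name__ == "__main__":
    print(json.dumps(build_humidity_query(), indent=2))
    # Made-up readings, not data from the study.
    sample = [
        {DEVICE_FIELD: "sensor-a", TIME_FIELD: "2021-06-01T10:05:00", VALUE_FIELD: 60.0},
        {DEVICE_FIELD: "sensor-a", TIME_FIELD: "2021-06-01T10:35:00", VALUE_FIELD: 62.0},
        {DEVICE_FIELD: "sensor-b", TIME_FIELD: "2021-06-01T10:10:00", VALUE_FIELD: 55.0},
    ]
    for key, avg in sorted(aggregate_locally(sample).items()):
        print(key, avg)
```

Nesting the `date_histogram` inside the `terms` aggregation mirrors a chart that is split by device first and by hour second; swapping the two levels would give one bucket per hour with the devices as sub-buckets.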

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ngo, T.T.T., Sarramia, D., Kang, MA. et al. A New Approach Based on ELK Stack for the Analysis and Visualisation of Geo-referenced Sensor Data. SN COMPUT. SCI. 4, 241 (2023). https://doi.org/10.1007/s42979-022-01628-6

  • DOI: https://doi.org/10.1007/s42979-022-01628-6
