A framework for social media data analytics using Elasticsearch and Kibana

Wireless Networks

Abstract

Real-time online data processing is quickly becoming an essential tool in the analysis of social media for political trends, advertising, public health awareness programs and policy making. Traditionally, processes associated with offline analysis are productive and efficient only when the data collection is a one-time process. Currently, cutting-edge research requires real-time data analysis, which comes with a set of challenges, particularly the efficiency of continuous data fetching within the context of present NoSQL and relational databases. In this paper, we demonstrate a solution that effectively addresses the challenges of real-time analysis using a configurable Elasticsearch search engine. We use a distributed database architecture and pre-built indexing, and standardize the Elasticsearch framework for large-scale text mining. The results from the query engine are visualized in near real time.
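As a rough illustration of what "pre-built indexing" and a query over recent data might look like, the sketch below builds an Elasticsearch index definition and a query body as plain Python dictionaries. It is not the authors' configuration: the field names (`text`, `user`, `created_at`), the shard and replica counts, the 15-minute window and the helper names are all assumptions made for the example. The bodies follow the typeless mapping and `fixed_interval` syntax of Elasticsearch 7 and later; older releases such as 6.x expect a mapping type name and `interval` instead.

```python
import json


def build_index_body(shards=3, replicas=1):
    """Index settings and field mapping, defined before any data arrives.

    Field names and shard counts are hypothetical, not taken from the paper.
    """
    return {
        "settings": {
            "number_of_shards": shards,      # spreads the index across nodes
            "number_of_replicas": replicas,  # copies of each shard
        },
        "mappings": {
            "properties": {
                "text": {"type": "text"},        # analyzed for full-text search
                "user": {"type": "keyword"},     # exact-match field
                "created_at": {"type": "date"},  # enables time-range filters
            }
        },
    }


def build_recent_query(term, window="now-15m"):
    """Match posts containing `term` in a recent window, bucketed per minute."""
    return {
        "size": 0,  # return only the aggregation, not the matching documents
        "query": {
            "bool": {
                "must": [{"match": {"text": term}}],
                "filter": [{"range": {"created_at": {"gte": window}}}],
            }
        },
        "aggs": {
            "per_minute": {
                "date_histogram": {
                    "field": "created_at",
                    "fixed_interval": "1m",
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_index_body(), indent=2))
    print(json.dumps(build_recent_query("flu"), indent=2))
```

Bodies like these would be sent to a cluster through a client library or the REST API. The per-minute buckets are the kind of result a Kibana time-series panel can plot.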

Acknowledgements

This research is funded by the NSERC Discovery Grant; computing resources are provided by the High Performance Computing (HPC) Lab and Department of Computer Science at Lakehead University, Canada. The authors are grateful to Gaurav Sharma for initially setting up the data collection stream, Salimur Choudhury for providing insight on the data analysis, and Andrew Heppner for reviewing and editing drafts.

Author information

Corresponding author

Correspondence to Vijay Mago.

About this article

Cite this article

Shah, N., Willick, D. & Mago, V. A framework for social media data analytics using Elasticsearch and Kibana. Wireless Netw 28, 1179–1187 (2022). https://doi.org/10.1007/s11276-018-01896-2

  • DOI: https://doi.org/10.1007/s11276-018-01896-2
