On-Line Big-Data Processing for Visual Analytics with Argus-Panoptes

  • Conference paper
Algorithmic Aspects of Cloud Computing (ALGOCLOUD 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11409)

Abstract

Analyses based on data mining and knowledge discovery techniques are not always successful, as they occasionally yield no actionable results. This is especially true in the Big-Data context, where we routinely deal with complex, heterogeneous, diverse and rapidly changing data. In this context, visual analytics plays a key role in helping both experts and users to readily comprehend and better manage analyses carried out on data stored in Infrastructure as a Service (IaaS) cloud services. To this end, humans should play a critical role in continually ascertaining the value of the processed information, and they are invariably the instigators of actionable tasks. This is facilitated by sophisticated tools that let humans interface with the data through vision and interaction. When working on Big-Data problems, both the scale and the nature of the data present a barrier to implementing responsive applications. In this paper, we propose a software architecture that seeks to empower Big-Data analysts with visual analytics tools atop large-scale data stored in and processed by IaaS. Our key goal is not only to provide on-line analytic processing but also to give users the facilities to interact effectively with the underlying IaaS machinery. Although we focus on hierarchical and spatiotemporal datasets here, our proposed architecture is general and can be applied to a wide range of application domains. The core design principles of our approach are: (a) on-line processing on the cloud with Apache Spark; (b) integration of interactive programming following the notebook paradigm through Apache Zeppelin; and (c) robust operation when data and/or schema change on the fly. Through experimentation with a prototype of our suggested architecture, we demonstrate not only the viability of our approach but also its value in a use case involving publicly available crime data from the United Kingdom.

Notes

  1. Argus-Panoptes is a figure from Greek mythology: an “all-seeing” giant who served as a watchman.

  2. The source code repository is available at: https://github.com/panayiotis/visual_analytics.

  3. Around 200 MB in total.


Author information

Corresponding authors

Correspondence to Panayiotis I. Vlantis or Alex Delis.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Vlantis, P.I., Delis, A. (2019). On-Line Big-Data Processing for Visual Analytics with Argus-Panoptes. In: Disser, Y., Verykios, V. (eds) Algorithmic Aspects of Cloud Computing. ALGOCLOUD 2018. Lecture Notes in Computer Science, vol 11409. Springer, Cham. https://doi.org/10.1007/978-3-030-19759-9_7

  • DOI: https://doi.org/10.1007/978-3-030-19759-9_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-19758-2

  • Online ISBN: 978-3-030-19759-9

  • eBook Packages: Computer Science (R0)
