Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 522))

Abstract

This paper describes the design and implementation of R functions for the analysis and visualization of Twitter feeds, combining analytical technologies with big data processing tools. The main idea was to use the storage and computational capabilities of the Hadoop processing framework in analytical tasks designed and implemented in the R language. For this purpose, we used Hadoop HDFS for storage and MapReduce v2 for the processing logic, connected via the Tessera framework to analytical functions written in R. The results of the analysis were presented as graph visualizations, implemented using the Trelliscope framework, which supports flexible, fast, and effective visualization of large complex data in the R environment.
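Tessera's datadr package is built around the divide-and-recombine approach: data are split into subsets by a key, an analytic function is applied to each subset independently (which is what allows the work to be distributed as MapReduce jobs), and the per-subset results are combined. The following is a minimal sketch of that idea in plain Python, not the authors' R code and not the Tessera API. The function names `divide` and `recombine`, the tweet fields, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def divide(records, key):
    """Split records into subsets, one per distinct value of key(record)."""
    subsets = defaultdict(list)
    for record in records:
        subsets[key(record)].append(record)
    return dict(subsets)


def recombine(subsets, analytic):
    """Apply the analytic to each subset independently and collect the results."""
    return {k: analytic(subset) for k, subset in subsets.items()}


# Hypothetical sample data standing in for a collection of tweets.
tweets = [
    {"user": "a", "lang": "en", "text": "hello hadoop"},
    {"user": "b", "lang": "sk", "text": "ahoj hadoop"},
    {"user": "c", "lang": "en", "text": "plots in R"},
]

# Divide by language, then count the tweets in each subset.
by_lang = divide(tweets, key=lambda t: t["lang"])
counts = recombine(by_lang, analytic=len)
```

Because each subset is processed without reference to the others, the per-subset step is the part that a cluster framework can run in parallel; here it simply runs in a loop on one machine.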

Notes

  1. Tessera: http://tessera.io/

  2. R project: https://www.r-project.org/

  3. Datadr: http://tessera.io/docs-datadr/

  4. RHIPE: http://tessera.io/docs-RHIPE/

  5. Trelliscope: http://tessera.io/docs-trelliscope/

  6. UrbanSensing project: http://urban-sensing.eu/

  7. http://bokeh.pydata.org/

  8. http://hafen.github.io/rbokeh/

  9. http://leafletjs.com/

Acknowledgments

The work presented in this paper was supported by the KEGA project under grant No. 025TUKE-4/2015 and also by the VEGA project under grant No. 1/0493/16.

Author information

Corresponding author

Correspondence to Martin Sarnovsky.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Sarnovsky, M., Butka, P., Paulina, J. (2017). Social-Media Data Analysis Using Tessera Framework in the Hadoop Cluster Environment. In: Grzech, A., Świątek, J., Wilimowska, Z., Borzemski, L. (eds) Information Systems Architecture and Technology: Proceedings of 37th International Conference on Information Systems Architecture and Technology – ISAT 2016 – Part II. Advances in Intelligent Systems and Computing, vol 522. Springer, Cham. https://doi.org/10.1007/978-3-319-46586-9_19

  • DOI: https://doi.org/10.1007/978-3-319-46586-9_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46585-2

  • Online ISBN: 978-3-319-46586-9

  • eBook Packages: Engineering, Engineering (R0)
