Complex Data Analysis


Part of the book series: Studies in Big Data (SBD, volume 46)

Abstract

Data science applications often need to deal with data that does not fit into the standard entity-attribute-value model. In this chapter we discuss three such types of data: texts, images, and graphs. The importance of social media is one of the reasons for the interest in graphs, as they are a way to represent social networks and, more generally, any type of interaction between people. We present examples of tools that can be used to extract information from, and thus analyze, these three types of data. In particular, we discuss topic modeling using a hierarchical statistical model as a way to extract relevant topics from texts, image analysis using convolutional neural networks, and measures and visual methods to summarize information from graphs.
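As a rough illustration of what "measures to summarize information from graphs" can mean, the sketch below computes node degree, normalized degree centrality, and graph density for a small undirected interaction network. It is a minimal sketch under stated assumptions, not the chapter's own method: the function name `graph_summary`, the edge list `interactions`, and the choice of these three measures are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def graph_summary(edges):
    """Summarize an undirected simple graph given as (u, v) pairs.

    Returns (degree, centrality, density), where centrality is the
    degree divided by the maximum possible degree (n - 1).
    Hypothetical helper for illustration only.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops in this simple sketch
        neighbors[u].add(v)
        neighbors[v].add(u)

    n = len(neighbors)
    degree = {node: len(adj) for node, adj in neighbors.items()}
    m = sum(degree.values()) // 2  # each edge is counted from both ends

    centrality = {
        node: (d / (n - 1) if n > 1 else 0.0) for node, d in degree.items()
    }
    # Density: fraction of possible undirected edges that are present.
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    return degree, centrality, density


# Hypothetical interaction data: who has interacted with whom.
interactions = [("ana", "ben"), ("ana", "cho"), ("ana", "dev"), ("ben", "cho")]
degree, centrality, density = graph_summary(interactions)
```

In this toy network, a node connected to every other node gets centrality 1.0, which is one simple way to flag the most connected participant in a social network.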

Corresponding author

Correspondence to Juhee Bae.

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Bae, J., Karlsson, A., Mellin, J., Ståhl, N., Torra, V. (2019). Complex Data Analysis. In: Said, A., Torra, V. (eds) Data Science in Practice. Studies in Big Data, vol 46. Springer, Cham. https://doi.org/10.1007/978-3-319-97556-6_9
