Complex Data Analysis

Bae, Juhee; Karlsson, Alexander; Mellin, Jonas; Ståhl, Niclas; Torra, Vicenç

doi:10.1007/978-3-319-97556-6_9

Part of the book series: Studies in Big Data (SBD, volume 46)

Abstract

Data science applications often need to deal with data that do not fit into the standard entity-attribute-value model. In this chapter we discuss three such types of data: texts, images and graphs. The importance of social media is one of the reasons for the interest in graphs, as they are a way to represent social networks and, more generally, any type of interaction between people. We present examples of tools that can be used to extract information from, and thus analyze, these three types of data. In particular, we discuss topic modeling using a hierarchical statistical model as a way to extract relevant topics from texts, image analysis using convolutional neural networks, and measures and visual methods to summarize information from graphs.

Author information

Authors and Affiliations

School of Informatics, University of Skövde, Skövde, Sweden
Juhee Bae, Alexander Karlsson, Jonas Mellin, Niclas Ståhl & Vicenç Torra

Corresponding author

Correspondence to Juhee Bae.

Editor information

Editors and Affiliations

University of Skövde, Skövde, Sweden
Alan Said
University of Skövde, Skövde, Sweden
Vicenç Torra

About this chapter

Cite this chapter

Bae, J., Karlsson, A., Mellin, J., Ståhl, N., Torra, V. (2019). Complex Data Analysis. In: Said, A., Torra, V. (eds) Data Science in Practice. Studies in Big Data, vol 46. Springer, Cham. https://doi.org/10.1007/978-3-319-97556-6_9

DOI: https://doi.org/10.1007/978-3-319-97556-6_9
Published: 20 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97555-9
Online ISBN: 978-3-319-97556-6
eBook Packages: Intelligent Technologies and Robotics (R0)
