Abstract
Professionals in the life sciences face ever-growing masses of complex data sets. Very little of this data is structured in a way that lets traditional information retrieval methods work perfectly. A large portion is weakly structured, and the majority falls into the category of unstructured data. To discover previously unknown knowledge from this data, we need advanced and novel methods that deal with the data from two aspects: time (e.g. information entropy) and space (e.g. computational topology). In this paper we show some examples of biomedical data and discuss a taxonomy of data, with a focus on the specifics of medical data sets.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Holzinger, A., Stocker, C., Dehmer, M. (2014). Big Complex Biomedical Data: Towards a Taxonomy of Data. In: Obaidat, M., Filipe, J. (eds) E-Business and Telecommunications. ICETE 2012. Communications in Computer and Information Science, vol 455. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44791-8_1
DOI: https://doi.org/10.1007/978-3-662-44791-8_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44790-1
Online ISBN: 978-3-662-44791-8
eBook Packages: Computer Science (R0)