Big Complex Biomedical Data: Towards a Taxonomy of Data

  • Conference paper
E-Business and Telecommunications (ICETE 2012)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 455)

Abstract

Professionals in the life sciences are faced with ever-growing masses of complex data sets. Very little of this data is structured, which is the case where traditional information retrieval methods work perfectly. A large portion of the data is weakly structured; however, the majority falls into the category of unstructured data. To discover previously unknown knowledge in this data, we need advanced and novel methods that deal with the data from two aspects: time (e.g. information entropy) and space (e.g. computational topology). In this paper we show some examples of biomedical data and discuss a taxonomy of data, with specifics on medical data sets.

Author information

Corresponding author

Correspondence to Andreas Holzinger.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Holzinger, A., Stocker, C., Dehmer, M. (2014). Big Complex Biomedical Data: Towards a Taxonomy of Data. In: Obaidat, M., Filipe, J. (eds) E-Business and Telecommunications. ICETE 2012. Communications in Computer and Information Science, vol 455. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44791-8_1

  • DOI: https://doi.org/10.1007/978-3-662-44791-8_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-44790-1

  • Online ISBN: 978-3-662-44791-8

  • eBook Packages: Computer Science, Computer Science (R0)
