Knowledge Discovery and Data Mining in Biomedical Informatics: The Future Is in Integrative, Interactive Machine Learning Solutions

Interactive Knowledge Discovery and Data Mining in Biomedical Informatics

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8401)

Abstract

Biomedical research is drowning in data, yet starving for knowledge. Current challenges in biomedical research and clinical practice include information overload (the need to combine vast amounts of structured, semi-structured and weakly structured data with vast amounts of unstructured information) and the need to optimize workflows, processes and guidelines in order to increase capacity while reducing costs and improving efficiency. In this paper we provide a very short overview of interactive and integrative solutions for knowledge discovery and data mining. In particular, we emphasize the benefits of including the end user in the “interactive” knowledge discovery process. We describe some of the most important challenges, including the need to develop and apply novel methods, algorithms and tools for the integration, fusion, pre-processing, mapping, analysis and interpretation of complex biomedical data, with the aim of identifying testable hypotheses and building realistic models. The HCI-KDD approach, a synergistic combination of methodologies and approaches from two areas, Human–Computer Interaction (HCI) and Knowledge Discovery & Data Mining (KDD), offers ideal conditions for addressing these challenges, with the goal of supporting human intelligence with machine intelligence. There is an urgent need for integrative and interactive machine learning solutions, because no medical doctor or biomedical researcher can keep pace today with the increasingly large and complex data sets, often called “Big Data”.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Holzinger, A., Jurisica, I. (2014). Knowledge Discovery and Data Mining in Biomedical Informatics: The Future Is in Integrative, Interactive Machine Learning Solutions. In: Holzinger, A., Jurisica, I. (eds) Interactive Knowledge Discovery and Data Mining in Biomedical Informatics. Lecture Notes in Computer Science, vol 8401. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43968-5_1

  • DOI: https://doi.org/10.1007/978-3-662-43968-5_1

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-43967-8

  • Online ISBN: 978-3-662-43968-5

  • eBook Packages: Computer Science, Computer Science (R0)
