Data Semantics Meets Knowledge Discovery in Databases

Part of the book series: Studies in Big Data (SBD, volume 31)

Abstract

Over the last 30 years, two important fields have emerged and developed rapidly: knowledge discovery, and knowledge management based on semantics. This chapter gives an overview of the links between them, from the perspective of how systems and platforms that support knowledge discovery have evolved with the help of data semantics.

Notes

  1. https://www.r-project.org/.

  2. http://www.cs.waikato.ac.nz/ml/weka/.

  3. http://community.rapidminer.com/.

  4. https://www.knime.org/.

  5. https://cran.r-project.org/.

  6. https://www.tensorflow.org/.

  7. https://azure.microsoft.com/en-us/services/machine-learning/.

  8. See e.g. http://www.kdd.org/kdd2017/Calls/view/kdd-2017-call-for-research-papers, http://ecmlpkdd2017.ijs.si/submission.html, sections on reproducibility.

Author information

Corresponding author

Correspondence to Domenico Potena.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Diamantini, C., Potena, D., Storti, E. (2018). Data Semantics Meets Knowledge Discovery in Databases. In: Flesca, S., Greco, S., Masciari, E., Saccà, D. (eds) A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years. Studies in Big Data, vol 31. Springer, Cham. https://doi.org/10.1007/978-3-319-61893-7_23

  • DOI: https://doi.org/10.1007/978-3-319-61893-7_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-61892-0

  • Online ISBN: 978-3-319-61893-7

  • eBook Packages: Engineering, Engineering (R0)
