How to Semantically Enhance a Data Mining Process?

  • Conference paper
Enterprise Information Systems (ICEIS 2008)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 19)

Abstract

This paper presents KEOPS, a data mining methodology centered on the integration of domain knowledge. KEOPS is a CRISP-DM-compliant methodology that integrates a knowledge base and an ontology. We focus first on the pre-processing steps of business understanding and data understanding, which are used to build an ontology-driven information system (ODIS). We then show how the knowledge base is used in the post-processing step of model interpretation. We detail the role of the ontology and define a part-way interestingness measure that integrates both objective and subjective criteria in order to evaluate model relevance according to expert knowledge. We present experiments conducted on real data and their results.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Brisson, L., Collard, M. (2009). How to Semantically Enhance a Data Mining Process? In: Filipe, J., Cordeiro, J. (eds) Enterprise Information Systems. ICEIS 2008. Lecture Notes in Business Information Processing, vol 19. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00670-8_8

  • DOI: https://doi.org/10.1007/978-3-642-00670-8_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00669-2

  • Online ISBN: 978-3-642-00670-8

  • eBook Packages: Computer Science, Computer Science (R0)
