Knowledge discovery standards

Artificial Intelligence Review

Abstract

As knowledge discovery (KD) matures and enters the mainstream, there is an onus on the technology developers to provide the technology in a deployable, embeddable form. This transition from a stand-alone technology, in the control of the knowledgeable few, to a widely accessible and usable technology will require the development of standards. These standards need to be designed to address various aspects of KD ranging from the actual process of applying the technology in a business environment, so as to make the process more transparent and repeatable, through to the representation of knowledge generated and the support for application developers. The large variety of data and model formats that researchers and practitioners have to deal with and the lack of procedural support in KD have prompted a number of standardization efforts in recent years, led by industry and supported by the KD community at large. This paper provides an overview of the most prominent of these standards and highlights how they relate to each other using some example applications of these standards.

Author information

Corresponding author

Correspondence to Sarabjot Singh Anand.

About this article

Cite this article

Anand, S.S., Grobelnik, M., Herrmann, F. et al. Knowledge discovery standards. Artif Intell Rev 27, 21–56 (2007). https://doi.org/10.1007/s10462-008-9067-4

  • DOI: https://doi.org/10.1007/s10462-008-9067-4