Abstract
Inductive databases (IDBs) represent a database perspective on knowledge discovery in databases (KDD). In an IDB, a KDD application can express both queries that access and manipulate data and queries that generate, manipulate, and apply patterns, which makes it possible to formalize the notion of a mining process. What distinguishes IDBs from other data mining applications is precisely the idea of treating support for knowledge discovery as an extension of the query process. This paper draws up a list of desirable properties to be taken into account in the definition of an IDB framework. They involve several dimensions, such as the expressiveness of the language in representing data and models, the closure principle, and the capability to support efficient algorithm programming. These requirements form the basis for a comparative study that highlights the strengths and weaknesses of existing IDB approaches. The paper focuses on the SQL-based ATLaS language/system, the logic-based \({\mathcal{LDL}++}\) language/system, and the XML-based KDDML language/system.
Cite this article
Romei, A., Turini, F. Inductive database languages: requirements and examples. Knowl Inf Syst 26, 351–384 (2011). https://doi.org/10.1007/s10115-009-0281-4
DOI: https://doi.org/10.1007/s10115-009-0281-4