Abstract
Analyzing huge amounts of information is feasible only with the support of information systems. First, the information must be accumulated and stored in a persistent structure that enables effective data access and management. The main aspects of present-day data processing are: storing data in (mostly relational) databases; improving processing efficiency through parallel analysis [1]; distributed processing, which is necessary for institutions consisting of autonomous, geographically distributed departments; query languages (SQL), which remain a fundamental way to access data in databases; and data analysis, which often includes data mining (building data models that describe data characteristics or predict some features) [2].
In view of these circumstances, the authors propose an enhancement of SQL for data mining of a distributed data structure. The basic assumptions are complete, horizontal data fragmentation and an explicit model format. Building a global data model consists of two stages. In the first, local models are built in parallel. In the second, these models are combined into a global data picture. The authors presented a detailed description of the combining methods for global classification models in [3].
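The two-stage scheme can be illustrated with a minimal Python sketch. It is not the authors' SQL enhancement or the combining method of [3], neither of which is specified in this abstract. It assumes, purely for illustration, that each local model is an explicit table of counts (a naive-Bayes-style summary), so that under complete horizontal fragmentation the global model is obtained by summing the local tables. The names build_local_model, combine_models and build_global_model, and the sample rows, are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def build_local_model(fragment):
    """Stage 1: build an explicit local model from one horizontal fragment.

    Each row is (attribute_1, ..., attribute_n, class_label). The model is a
    pair of count tables: rows per class, and rows per (attribute index,
    attribute value, class) combination.
    """
    class_counts = Counter()
    value_counts = Counter()
    for *attributes, label in fragment:
        class_counts[label] += 1
        for index, value in enumerate(attributes):
            value_counts[(index, value, label)] += 1
    return class_counts, value_counts


def combine_models(local_models):
    """Stage 2: merge local models into a global one by summing the counts."""
    total_classes, total_values = Counter(), Counter()
    for class_counts, value_counts in local_models:
        total_classes.update(class_counts)
        total_values.update(value_counts)
    return total_classes, total_values


def build_global_model(fragments):
    """Build the local models concurrently, then combine them."""
    with ThreadPoolExecutor() as pool:
        local_models = list(pool.map(build_local_model, fragments))
    return combine_models(local_models)


# Hypothetical data: two departments hold disjoint sets of rows (fragments).
fragments = [
    [("sunny", "hot", "no"), ("rainy", "mild", "yes")],
    [("sunny", "mild", "yes"), ("rainy", "hot", "no"), ("sunny", "hot", "no")],
]
global_model = build_global_model(fragments)
```

Because the fragments are disjoint and together cover all rows, summing counts yields the same tables as building the model over the union of the data. Models that are not simple sufficient statistics generally need a more elaborate combining step.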
References
Ullman, J.D., Widom, J.: A First Course in Database Systems. Prentice-Hall, Inc., Englewood Cliffs (1997)
Hand, D., Mannila, H., Smyth, P.: Principles of Data Mining. The MIT Press, Cambridge (2001)
Gorawski, M., Pluciennik, E.: Analytical Models Combining Methodology with Classification Model Example. In: 1st IEEE International Conference on Information Technology, Gdansk, Poland (2008)
International Organization for Standardization (ISO). Information Technology, Database Language, SQL Multimedia and Application Packages, Part 6: Data Mining Draft Standard No. ISO/IEC 13249-6 (2003)
Han, J., Fu, Y., Wang, W., Koperski, K., Zaiane, O.: DMQL: A Data Mining Query Language for Relational Database. In: Proc. of SIGMOD Workshop DMKD, Montreal, Canada (1996)
Imieliński, T., Virmani, A.: MSQL: A Query Language for Database Mining. Data Mining and Knowledge Discovery (1999)
Meo, R., Psaila, G., Ceri, S.: An Extension to SQL for Mining Association Rules. Data Mining and Knowledge Discovery (1998)
Morzy, T., Zakrzewicz, M.: SQL-like language for database mining. In: Proc. of the First East-European Symposium on Advances in Databases and Information Systems - ADBIS, St. Petersburg (1997)
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gorawski, M., Pluciennik, E. (2008). Distributed Data Mining by Means of SQL Enhancement. In: Meersman, R., Tari, Z., Herrero, P. (eds) On the Move to Meaningful Internet Systems: OTM 2008 Workshops. OTM 2008. Lecture Notes in Computer Science, vol 5333. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88875-8_16
DOI: https://doi.org/10.1007/978-3-540-88875-8_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88874-1
Online ISBN: 978-3-540-88875-8
eBook Packages: Computer Science, Computer Science (R0)