Abstract
Data mining is often a compute-intensive and time-consuming process. For this reason, several data mining systems have been implemented on parallel computing platforms to achieve high performance in the analysis of large data sets. Moreover, when large data repositories are coupled with the geographical distribution of data, users, and systems, more sophisticated technologies are needed to implement high-performance distributed KDD systems. Recently, computational Grids have emerged as privileged platforms for distributed computing, and a growing number of Grid-based KDD systems have been designed. In this paper, we first outline different ways to exploit parallelism in the main data mining techniques and algorithms, and then we discuss Grid-based KDD systems.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cesario, E., Talia, D. (2007). From Parallel Data Mining to Grid-Enabled Distributed Knowledge Discovery. In: An, A., Stefanowski, J., Ramanna, S., Butz, C.J., Pedrycz, W., Wang, G. (eds) Rough Sets, Fuzzy Sets, Data Mining and Granular Computing. RSFDGrC 2007. Lecture Notes in Computer Science, vol 4482. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72530-5_3
DOI: https://doi.org/10.1007/978-3-540-72530-5_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72529-9
Online ISBN: 978-3-540-72530-5
eBook Packages: Computer Science, Computer Science (R0)