Abstract
The Grid computing paradigm is establishing itself as a novel, reliable, and effective way to exploit a pool of hardware resources and make them available to users. Data mining benefits from the Grid because it often requires running time-consuming algorithms on large amounts of data, which may reside on a different resource from the one hosting the appropriate data-mining algorithms. In addition, machine learning methods have recently become available for knowledge discovery, a topic of interest to a large community of users. The present work gives an account of how the ways of providing a user with a data-mining service have evolved: from a web interface to a Grid service, the exploitation of a complex resource is considered from both a technical and a user-friendliness point of view. More specifically, the goal is to show the advantage of running data-mining algorithms on the Grid. Such an environment can employ computational and storage resources efficiently, making it possible to open data-mining services to Grid users and to provide services in business contexts.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Anguita, D., Poggi, A., Rivieccio, F., Scapolla, A.M. (2005). Data Mining Tools: From Web to Grid Architectures. In: Sloot, P.M.A., Hoekstra, A.G., Priol, T., Reinefeld, A., Bubak, M. (eds) Advances in Grid Computing - EGC 2005. EGC 2005. Lecture Notes in Computer Science, vol 3470. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11508380_63
DOI: https://doi.org/10.1007/11508380_63
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26918-2
Online ISBN: 978-3-540-32036-4
eBook Packages: Computer Science (R0)