Data Mining Tools: From Web to Grid Architectures

  • Conference paper
Advances in Grid Computing - EGC 2005 (EGC 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3470)

Abstract

Grid computing is establishing itself as a novel, reliable, and effective way to exploit a pool of hardware resources and make them available to users. Data mining benefits from the Grid because it often requires running time-consuming algorithms on large amounts of data, which may reside on a different resource from the one hosting the appropriate data-mining algorithms. In recent times, machine learning methods have also become available for knowledge discovery, a topic of interest to a large community of users. The present work gives an account of how the ways of providing a user with a data-mining service have evolved: from a web interface to a Grid service, the exploitation of a complex resource is considered from both a technical and a user-friendliness point of view. More specifically, the goal is to show the advantage of running data-mining algorithms on the Grid. Such an environment can employ computational and storage resources efficiently, making it possible to open data-mining services to Grid users and to provide services in business contexts.

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Anguita, D., Poggi, A., Rivieccio, F., Scapolla, A.M. (2005). Data Mining Tools: From Web to Grid Architectures. In: Sloot, P.M.A., Hoekstra, A.G., Priol, T., Reinefeld, A., Bubak, M. (eds) Advances in Grid Computing - EGC 2005. EGC 2005. Lecture Notes in Computer Science, vol 3470. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11508380_63

  • DOI: https://doi.org/10.1007/11508380_63

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-26918-2

  • Online ISBN: 978-3-540-32036-4

  • eBook Packages: Computer Science, Computer Science (R0)
