Clusters and Grids for Distributed and Parallel Knowledge Discovery

  • Conference paper
High Performance Computing and Networking (HPCN-Europe 2000)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1823)

Abstract

Parallel and Distributed Knowledge Discovery (PDKD) is emerging as a possible killer application for clusters and grids of computers. The need to process large volumes of data and the availability of parallel data mining algorithms make it possible to exploit the increasing computational power of clusters at low cost. On the other hand, grid computing is an emerging “standard” for developing and deploying distributed, high-performance applications over geographic networks, in different domains, and in particular for data-intensive applications. This paper proposes an approach for integrating clusters of computers within a grid infrastructure so that, enriched by specific data mining services, they can serve as the deployment platform for high-performance distributed data mining and knowledge discovery.

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cannataro, M. (2000). Clusters and Grids for Distributed and Parallel Knowledge Discovery. In: Bubak, M., Afsarmanesh, H., Hertzberger, B., Williams, R. (eds) High Performance Computing and Networking. HPCN-Europe 2000. Lecture Notes in Computer Science, vol 1823. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45492-6_86

  • DOI: https://doi.org/10.1007/3-540-45492-6_86

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-67553-2

  • Online ISBN: 978-3-540-45492-2

  • eBook Packages: Springer Book Archive
