On the Parallelization of the Sparse Grid Approach for Data Mining

  • Conference paper
Large-Scale Scientific Computing (LSSC 2001)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2179))

Abstract

Recently we presented a new approach [5, 6] to the classification problem arising in data mining. It is based on the regularization network approach but, in contrast to other methods, which employ ansatz functions associated with data points, we use basis functions defined on a grid in the usually high-dimensional feature space for the minimization process. To cope with the curse of dimensionality, we employ so-called sparse grids. More precisely, we use the sparse grid combination technique [11], in which the classification problem is discretized and solved on a sequence of conventional grids with uniform mesh sizes in each dimension. The sparse grid solution is then obtained by linear combination of these partial solutions. The method scales only linearly with the number of data points and is well suited for data mining applications where the amount of data is very large but the dimension of the feature space is moderately high. The computations on the individual grids of the sequence are independent of one another and can therefore be carried out in parallel at a coarse-grained level. A second, fine-grained level of parallelization can be introduced on each grid through threading on shared-memory multiprocessor computers.
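To make the "sequence of conventional grids" and the "linear combination" concrete, the following minimal Python sketch enumerates the grids and their combination coefficients. It assumes the form of the combination technique commonly stated in the sparse grid literature, in which grids with level vectors (l_1, ..., l_d), l_t >= 1, and l_1 + ... + l_d = n + (d-1) - q receive the coefficient (-1)^q * binom(d-1, q) for q = 0, ..., d-1. Index conventions vary between papers, so the level offset used here is an assumption, and the function names are purely illustrative.

```python
from math import comb


def compositions(total, parts):
    """Yield all tuples of `parts` positive integers that sum to `total`."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def combination_grids(d, n):
    """Return (level_vector, coefficient) pairs for dimension d and level n.

    Assumed convention: a grid with level vector l has mesh size 2**-l_t
    in dimension t, and levels start at 1.
    """
    grids = []
    for q in range(d):
        coeff = (-1) ** q * comb(d - 1, q)
        level_sum = n + (d - 1) - q
        for levels in compositions(level_sum, d):
            grids.append((levels, coeff))
    return grids


if __name__ == "__main__":
    for levels, coeff in combination_grids(d=2, n=3):
        print(levels, coeff)
```

Under this convention the coefficients sum to one, so a constant function is reproduced exactly by the combination.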

We describe the sparse grid combination technique for the classification problem, discuss the two levels of parallelization, and report results on a 10-dimensional data set.
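The coarse-grained level of parallelization can be illustrated with a small Python sketch: each grid is handled by an independent task, and the partial results are then combined linearly. This is not the authors' implementation. The per-grid "solver" `solve_on_grid` is a hypothetical stand-in that merely interpolates the test function g(x) = x_1^2 * ... * x_d^2 on the grid, the level convention (levels start at 1, mesh size 2^-l) is an assumption, and a thread pool is used only to show the task structure. For CPU-bound work in CPython, a process pool or a message-passing library would be the more realistic choice.

```python
from concurrent.futures import ThreadPoolExecutor
from math import comb


def compositions(total, parts):
    """Yield all tuples of `parts` positive integers that sum to `total`."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def combination_grids(d, n):
    """(level_vector, coefficient) pairs; assumed convention, levels >= 1."""
    return [
        (levels, (-1) ** q * comb(d - 1, q))
        for q in range(d)
        for levels in compositions(n + (d - 1) - q, d)
    ]


def solve_on_grid(levels, x):
    """Hypothetical stand-in for the per-grid solver.

    Evaluates at the point x (in [0, 1]^d) the piecewise multilinear
    interpolant of g(x) = prod_t x_t**2 on the grid with mesh size
    2**-l_t in dimension t. Because g is a product of 1D functions,
    the interpolant is the product of 1D linear interpolants.
    """
    value = 1.0
    for level, xt in zip(levels, x):
        cells = 2 ** level
        h = 1.0 / cells
        i = min(int(xt / h), cells - 1)  # index of the cell containing xt
        left, right = i * h, (i + 1) * h
        w = (xt - left) / h
        value *= (1.0 - w) * left ** 2 + w * right ** 2
    return value


def combine_serial(d, n, x):
    """Reference version: solve on the grids one after another."""
    return sum(c * solve_on_grid(l, x) for l, c in combination_grids(d, n))


def combine_parallel(d, n, x, workers=4):
    """Coarse-grained version: one independent task per grid."""
    grids = combination_grids(d, n)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(lambda g: solve_on_grid(g[0], x), grids))
    # Linear combination of the partial solutions.
    return sum(c * p for (_, c), p in zip(grids, partial))


if __name__ == "__main__":
    x = (0.3, 0.7)
    print(combine_serial(2, 3, x), combine_parallel(2, 3, x))
```

Because the tasks share no state, the parallel and serial versions produce the same combined value. The fine-grained, per-grid threading mentioned in the abstract would live inside the real solver and is not modeled here.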

References

  1. R. Balder. Adaptive Verfahren für elliptische und parabolische Differentialgleichungen auf dünnen Gittern, Dissertation, Technische Universität München, 1994.

  2. M. J. A. Berry and G. S. Linoff. Mastering Data Mining, Wiley, 2000.

  3. H.-J. Bungartz. Dünne Gitter und deren Anwendung bei der adaptiven Lösung der dreidimensionalen Poisson-Gleichung, Dissertation, Institut für Informatik, Technische Universität München, 1992.

  4. K. Cios, W. Pedrycz, and R. Swiniarski. Data Mining Methods for Knowledge Discovery, Kluwer, 1998.

  5. J. Garcke and M. Griebel. Data mining with sparse grids using simplicial basis functions, in Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2001, also as SFB 256 Preprint 713, Universität Bonn, 2001.

  6. J. Garcke, M. Griebel, and M. Thess. Data mining with sparse grids, Computing, 2001, (to appear), also as SFB 256 Preprint 675, Institut für Angewandte Mathematik, Universität Bonn, 2000.

  7. F. Girosi. An equivalence between sparse approximation and support vector machines, Neural Computation, 10(6), 1455–1480, 1998.

  8. F. Girosi, M. Jones, and T. Poggio. Regularization theory and neural networks architectures, Neural Computation, 7, 219–265, 1995.

  9. M. Griebel. The combination technique for the sparse grid solution of PDEs on multiprocessor machines, Parallel Processing Letters, 2(1), 61–70, 1992, also as SFB Bericht 342/14/91 A, Institut für Informatik, TU München, 1991.

  10. M. Griebel, W. Huber, T. Störtkuhl, and C. Zenger. On the parallel solution of 3D PDEs on a network of workstations and on vector computers, in A. Bode and M. Dal Cin (eds.), Parallel Computer Architectures: Theory, Hardware, Software, Applications, Lecture Notes in Computer Science, 732, Springer Verlag, 276–291, 1993.

  11. M. Griebel, M. Schneider, and C. Zenger. A combination technique for the solution of sparse grid problems, in P. de Groen and R. Beauwens (eds.), Iterative Methods in Linear Algebra, IMACS, Elsevier, North Holland, 263–281, 1992, also as SFB Bericht 342/19/90 A, Institut für Informatik, TU München, 1990.

  12. G. Melli. Datgen: A program that creates structured data. Website. http://www.datasetgenerator.com.

  13. A. N. Tikhonov and V. A. Arsenin. Solutions of ill-posed problems, W.H. Winston, Washington D.C., 1977.

  14. G. Wahba. Spline models for observational data, Series in Applied Mathematics, 59, SIAM, Philadelphia, 1990.

  15. C. Zenger. Sparse grids, in W. Hackbusch (ed.), Parallel Algorithms for Partial Differential Equations, Proceedings of the Sixth GAMM-Seminar, Kiel, 1990, Notes on Num. Fluid Mech., 31, Vieweg-Verlag, 1991.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Garcke, J., Griebel, M. (2001). On the Parallelization of the Sparse Grid Approach for Data Mining. In: Margenov, S., Waśniewski, J., Yalamov, P. (eds) Large-Scale Scientific Computing. LSSC 2001. Lecture Notes in Computer Science, vol 2179. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45346-6_2

  • DOI: https://doi.org/10.1007/3-540-45346-6_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43043-8

  • Online ISBN: 978-3-540-45346-8

  • eBook Packages: Springer Book Archive
