Probing Knowledge in Distributed Data Mining

  • Conference paper
Methodologies for Knowledge Discovery and Data Mining (PAKDD 1999)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1574)

Abstract

In this paper, we propose a new approach for applying the meta-learning concept to distributed data mining. We call this approach Knowledge Probing; it organises a supervised learning process into two learning phases. In the first phase, a set of base classifiers is learned in parallel from a distributed data set. In the second phase, meta-learning is applied to induce the relationship between an attribute vector and the class predictions from all the base classifiers. When this approach is applied to an environment in which base classifiers are produced from distributed data sources, the output of the Knowledge Probing process can be viewed as the assimilated knowledge of that distributed learning system. Some initial experimental results on the quality of the assimilated knowledge are presented. We believe that integrating the Knowledge Probing technique with available data mining algorithms can provide a practical framework for distributed data mining applications.
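
The two phases described in the abstract can be illustrated with a minimal sketch. The abstract does not say which learners are used or how the base predictions are combined, so everything concrete here is an assumption made for illustration: a nearest-centroid base learner, a majority vote over base predictions, a one-level decision stump as the meta-learner, and the names `train_centroid`, `train_stump`, `knowledge_probing`, `PARTITIONS` and `PROBING`.

```python
from collections import Counter

def train_centroid(rows):
    """Base learner (an assumption): nearest-centroid over numeric attributes."""
    sums, counts = {}, Counter()
    for x, y in rows:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x):
        # Return the class whose centroid is closest in squared distance.
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))
    return predict

def train_stump(rows):
    """Meta-learner (an assumption): a single threshold rule on one attribute."""
    majority = Counter(y for _, y in rows).most_common(1)[0][0]
    best = (0, 0, float("inf"), majority, majority)  # fallback if no split exists
    for i in range(len(rows[0][0])):
        values = sorted({x[i] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = Counter(y for x, y in rows if x[i] <= t).most_common(1)[0]
            right = Counter(y for x, y in rows if x[i] > t).most_common(1)[0]
            score = left[1] + right[1]  # rows the rule would label correctly
            if score > best[0]:
                best = (score, i, t, left[0], right[0])
    return best[1:]  # (attribute index, threshold, left label, right label)

def apply_stump(rule, x):
    attr, threshold, left, right = rule
    return left if x[attr] <= threshold else right

def knowledge_probing(partitions, probing_set):
    # Phase 1: one base classifier per data partition (independent, so parallelisable).
    base = [train_centroid(part) for part in partitions]
    # Phase 2: label each probing vector with the base classifiers' majority vote
    # (the vote is an assumption), then learn attribute vector -> combined prediction.
    probed = [(x, Counter(clf(x) for clf in base).most_common(1)[0][0])
              for x in probing_set]
    return train_stump(probed)

# Hypothetical toy data: three sites, each holding its own labelled sample.
PARTITIONS = [
    [((1.0, 5.0), "neg"), ((2.0, 1.0), "neg"), ((8.0, 4.0), "pos"), ((9.0, 2.0), "pos")],
    [((0.0, 3.0), "neg"), ((3.0, 2.0), "neg"), ((7.0, 3.0), "pos"), ((10.0, 1.0), "pos")],
    [((2.0, 4.0), "neg"), ((1.0, 0.0), "neg"), ((9.0, 5.0), "pos"), ((8.0, 0.0), "pos")],
]
# Unlabelled probing vectors; their labels come only from the base classifiers.
PROBING = [(0.5, 2.0), (2.5, 4.0), (4.0, 1.0), (6.0, 3.0), (7.5, 2.0), (9.5, 4.0)]

print(knowledge_probing(PARTITIONS, PROBING))  # prints (0, 5.0, 'neg', 'pos')
```

In this sketch the returned rule plays the role of the assimilated knowledge: a single model that approximates what the base classifiers collectively predict and can be applied with `apply_stump` without consulting them again.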

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Guo, Y., Sutiwaraphun, J. (1999). Probing Knowledge in Distributed Data Mining. In: Zhong, N., Zhou, L. (eds) Methodologies for Knowledge Discovery and Data Mining. PAKDD 1999. Lecture Notes in Computer Science, vol 1574. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48912-6_59

  • DOI: https://doi.org/10.1007/3-540-48912-6_59

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65866-5

  • Online ISBN: 978-3-540-48912-2

  • eBook Packages: Springer Book Archive
