Improved Dataset Characterisation for Meta-learning

  • Conference paper
Discovery Science (DS 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2534)

Abstract

This paper presents new measures, based on the induced decision tree, to characterise datasets for meta-learning, with the aim of selecting appropriate learning algorithms. The main idea is to capture the characteristics of a dataset from the structural shape and size of the decision tree induced from it. In total, 15 measures are proposed to describe the structure of a decision tree. Their effectiveness is illustrated through extensive experiments, comparing the results to those obtained with existing data characterisation techniques, including the data characteristics tool (DCT), which is the most widely used technique in meta-learning, and Landmarking, which is the most recently developed method.
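The idea sketched in the abstract can be illustrated with a small example. The specific 15 measures of the paper are not reproduced here; the snippet below, a minimal sketch using scikit-learn, simply induces a decision tree on a dataset and extracts a few plausible structural statistics (size, depth, leaf counts) of the kind such a characterisation could use as meta-features.

```python
# Illustrative sketch only: induce a decision tree on a dataset and read off
# simple structural statistics that could serve as meta-features. These are
# NOT the paper's 15 measures, just examples of the same kind of quantity.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def tree_meta_features(X, y, random_state=0):
    """Fit a decision tree and return structural statistics as meta-features."""
    tree = DecisionTreeClassifier(random_state=random_state).fit(X, y)
    n_nodes = tree.tree_.node_count
    n_leaves = tree.get_n_leaves()
    return {
        "n_nodes": n_nodes,                     # overall tree size
        "n_leaves": n_leaves,                   # number of leaf nodes
        "n_internal": n_nodes - n_leaves,       # number of test (split) nodes
        "max_depth": tree.get_depth(),          # longest root-to-leaf path
        "leaves_per_node": n_leaves / n_nodes,  # a crude shape indicator
    }


X, y = load_iris(return_X_y=True)
features = tree_meta_features(X, y)
print(features)
```

In a meta-learning setting, such a feature vector would be computed once per dataset and used as input to a meta-level learner that predicts which base learning algorithm is likely to perform best.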



Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Peng, Y., Flach, P.A., Soares, C., Brazdil, P. (2002). Improved Dataset Characterisation for Meta-learning. In: Lange, S., Satoh, K., Smith, C.H. (eds) Discovery Science. DS 2002. Lecture Notes in Computer Science, vol 2534. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36182-0_14

  • DOI: https://doi.org/10.1007/3-540-36182-0_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00188-1

  • Online ISBN: 978-3-540-36182-4

  • eBook Packages: Springer Book Archive
