Abstract
We describe the IGTree learning algorithm, which compresses an instance base into a tree structure. The concept of information gain is used as a heuristic function for performing this compression. IGTree produces trees that, compared to other lazy learning approaches, reduce storage requirements and the time required to compute classifications. Furthermore, we obtained similar or better generalization accuracy with IGTree when trained on two complex linguistic tasks, viz. letter–phoneme transliteration and part-of-speech tagging, when compared to alternative lazy learning and decision tree approaches (viz., IB1, information-gain-weighted IB1, and C4.5). A third experiment, with the task of word hyphenation, demonstrates that when the differences in information gain between features are too small, IGTree as well as information-gain-weighted IB1 perform worse than IB1. These results indicate that IGTree is a useful algorithm for problems characterized by the availability of a large number of training instances described by symbolic features with sufficiently differing information gain values.
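The compression the abstract describes can be sketched in code: features are ordered once by information gain, the tree splits on them in that fixed order, each node stores the most frequent (default) class of the instances it covers, and arcs that add nothing beyond the parent's default are pruned. The sketch below is an illustrative reconstruction under those assumptions, not the authors' implementation; the function names and the dict-based node layout are hypothetical.

```python
import math
from collections import Counter

def information_gain(instances, feature, classes):
    """Class-label entropy minus the expected entropy after splitting on `feature`."""
    def entropy(labels):
        total = len(labels)
        return -sum((c / total) * math.log2(c / total)
                    for c in Counter(labels).values())
    split = {}
    for inst, cls in zip(instances, classes):
        split.setdefault(inst[feature], []).append(cls)
    remainder = sum(len(part) / len(classes) * entropy(part)
                    for part in split.values())
    return entropy(classes) - remainder

def build_igtree(instances, classes, order):
    """Build a node: a default class plus arcs on the next feature in the
    information-gain ordering; stop when the node is class-homogeneous."""
    default = Counter(classes).most_common(1)[0][0]
    node = {"default": default, "arcs": {}}
    if not order or len(set(classes)) == 1:
        return node
    node["feature"] = order[0]
    partition = {}
    for inst, cls in zip(instances, classes):
        sub_i, sub_c = partition.setdefault(inst[order[0]], ([], []))
        sub_i.append(inst)
        sub_c.append(cls)
    for value, (sub_i, sub_c) in partition.items():
        child = build_igtree(sub_i, sub_c, order[1:])
        # Prune arcs whose subtree only repeats the parent's default class.
        if child["arcs"] or child["default"] != default:
            node["arcs"][value] = child
    return node

def classify(tree, instance):
    """Follow matching arcs; fall back to the deepest matched node's default."""
    node = tree
    while node.get("arcs"):
        child = node["arcs"].get(instance.get(node["feature"]))
        if child is None:
            break
        node = child
    return node["default"]

# Toy usage on four instances with two symbolic features:
instances = [{"f1": "a", "f2": "x"}, {"f1": "a", "f2": "y"},
             {"f1": "b", "f2": "x"}, {"f1": "b", "f2": "y"}]
classes = ["P", "P", "Q", "R"]
order = sorted(["f1", "f2"],
               key=lambda f: information_gain(instances, f, classes),
               reverse=True)
tree = build_igtree(instances, classes, order)
classify(tree, {"f1": "b", "f2": "y"})  # → "R"
```

Classification needs at most one test per feature, and pruned arcs are what yield the storage compression relative to keeping the full instance base.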
References
Aha, D. W., Kibler, D., & Albert, M. (1991). Instance-Based Learning Algorithms. Machine Learning 7: 37–66.
Aha, D. W. (1992). Generalizing from Case Studies: A Case Study. In Proceedings of the Ninth International Conference on Machine Learning, pp. 1–10. Aberdeen, Scotland: Morgan Kaufmann.
Cardie, C. (1993). A Case-Based Approach to Knowledge Acquisition for Domain-Specific Sentence Analysis. In Proceedings of the Eleventh National Conference on Artificial Intelligence, pp. 798–803, San Jose, CA: AAAI Press.
Daelemans, W. (1995). Memory-based Lexical Acquisition and Processing. In Steffens, P. (ed.) Machine Translation and the Lexicon, Lecture Notes in Artificial Intelligence, 898. Springer: Berlin.
Daelemans, W. & Van den Bosch, A. (1992). Generalisation Performance of Backpropagation Learning on a Syllabification Task. In Drossaers, M. & Nijholt, A. (eds.) TWLT3: Connectionism and Natural Language Processing. Enschede: Twente University.
Daelemans, W. & Van den Bosch, A. (1994). A Language-Independent, Data-Oriented Architecture for Grapheme-to-Phoneme Conversion. In Proceedings of ESCA-IEEE Speech Synthesis Conference '94. New York.
Deng, K. & Moore, A. W. (1995). Multiresolution Instance-Based Learning. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence. Montreal: Morgan Kaufmann.
Dougherty, J., Kohavi, R. & Sahami, M. (1995). Supervised and Unsupervised Discretization of Continuous Features. In Proceedings of the Twelfth International Conference on Machine Learning, pp. 194–202, Tahoe City, CA: Morgan Kaufmann.
Friedman, J. H., Bentley, J. L. & Finkel, R. A. (1977). An Algorithm for Finding Best Matches in Logarithmic Expected Time. ACM Transactions on Mathematical Software 3(3): 209–226.
Kitano, H. (1993). Challenges of Massive Parallelism. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, pp. 813–834, Chambéry, France: Morgan Kaufmann.
Kohavi, R. & Li, C-H. (1995). Oblivious Decision Trees, Graphs, and Top-Down Pruning. In Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, pp. 1071–1077. Montreal: Morgan Kaufmann.
Langley, P. & Sage, S. (1994). Oblivious Decision Trees and Abstract Cases. In Aha, D. W. (ed.) Case-Based Reasoning: Papers from the 1994 Workshop (Technical Report WS–94–01). Menlo Park, CA: AAAI Press.
Nunn, A. & van Heuven, V. J. (1993). Morphon, Lexicon-Based Text-to-Phoneme Conversion and Phonological Rules. In van Heuven, V. J. & Pols, L. C. (eds.) Analysis and Synthesis of Speech: Strategic Research Towards High-Quality Text-to-Speech Generation. Berlin: Mouton de Gruyter.
Omohundro, S. M. (1991). Bumptrees for Efficient Function, Constraint, and Classification Learning. In Lippmann, R. P., Moody J. E. & Touretzky, D. S. (eds.) Advances in Neural Information Processing Systems 3. San Mateo, CA: Morgan Kaufmann.
Quinlan, J. (1993). C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann.
Rumelhart, D. E., Hinton, G. E. & Williams, R. J. (1986). Learning Internal Representations by Error Propagation. In Rumelhart, D. E. & McClelland, J. L. (eds.) Parallel Distributed Processing: Explorations in the Microstructure of Cognition, volume 1: Foundations. Cambridge, MA: The MIT Press.
Sejnowski, T. J. & Rosenberg, C. R. (1987). Parallel Networks that Learn to Pronounce English Text. Complex Systems 1: 145–168.
Stanfill, C. & Waltz, D. (1986). Toward Memory-Based Reasoning. Communications of the ACM 29: 1212–1228.
Van den Bosch, A. & Daelemans, W. (1993). Data-Oriented Methods for Grapheme-to-Phoneme Conversion. In Proceedings of the 6th Conference of the EACL, pp. 45–53. Utrecht: OTS.
Weijters, A. & Hoppenbrouwers, G. (1990). NetSpraak: een neuraal netwerk voor grafeem-foneem-omzetting [NetSpraak: a neural network for grapheme–phoneme conversion]. Tabu 20(1): 1–25.
Weijters, A. (1991). A Simple Look-Up Procedure Superior to NETtalk? In Proceedings of the International Conference on Artificial Neural Networks. Espoo, Finland.
Wess, S., Althoff, K. D. & Derwand, G. (1994). Using k-d Trees to Improve the Retrieval Step in Case-Based Reasoning. In Wess, S., Althoff K. D. & Richter, M. M. (eds.) Topics in Case-Based Reasoning. Berlin: Springer Verlag.
Wess, S. (1995). Fallbasiertes Problemlösen in wissensbasierten Systemen zur Entscheidungsunterstützung und Diagnostik [Case-based problem solving in knowledge-based systems for decision support and diagnostics]. Doctoral Dissertation, University of Kaiserslautern.
Wolpert, D. H. (1990). Constructing a Generalizer Superior to NETtalk via a Mathematical Theory of Generalization. Neural Networks 3: 445–452.
Cite this article
Daelemans, W., Van den Bosch, A. & Weijters, T. IGTree: Using Trees for Compression and Classification in Lazy Learning Algorithms. Artificial Intelligence Review 11, 407–423 (1997). https://doi.org/10.1023/A:1006506017891