When Errors Become the Rule: Twenty Years with Transformation-Based Learning

Published: 1 April 2014

Abstract

Transformation-based learning (TBL) is a machine learning method, aimed in particular at sequential classification, invented by Eric Brill [Brill 1993b, 1995a]. It is widely used within computational linguistics and natural language processing, but surprisingly little in other areas.

TBL is a simple yet flexible paradigm, which achieves competitive or even state-of-the-art performance in several areas and does not overtrain easily. It is especially successful at catching local, fixed-distance dependencies and seamlessly exploits information from heterogeneous discrete feature types. The learned representation—an ordered list of transformation rules—is compact and efficient, with clear semantics. Individual rules are interpretable and often meaningful to humans.
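To make the idea of an ordered list of transformation rules concrete, the following is a minimal illustrative sketch, not the formulation used in the article. It assumes a deliberately simple rule shape (change one tag to another when the tag at a fixed offset has a given value); the names `Rule`, `apply_rule`, and `apply_rules`, as well as the toy tags, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule shape: rewrite from_tag to to_tag when the tag
    at a fixed offset from the current position equals context_tag."""
    from_tag: str
    to_tag: str
    offset: int        # fixed distance to the context position (-1 = previous token)
    context_tag: str


def apply_rule(rule: Rule, tags: List[str]) -> List[str]:
    """Apply one rule at every position, reading context from the input sequence."""
    out = list(tags)
    for i, tag in enumerate(tags):
        j = i + rule.offset
        if tag == rule.from_tag and 0 <= j < len(tags) and tags[j] == rule.context_tag:
            out[i] = rule.to_tag
    return out


def apply_rules(rules: List[Rule], tags: List[str]) -> List[str]:
    """Apply the rules in list order; each rule sees the output of the previous ones."""
    for rule in rules:
        tags = apply_rule(rule, tags)
    return tags


# Toy input (an assumption for illustration): initial tags for "to race the race".
initial = ["TO", "NN", "DT", "NN"]
rules = [Rule("NN", "VB", -1, "TO")]  # change NN to VB if the previous tag is TO
print(apply_rules(rules, initial))    # ['TO', 'VB', 'DT', 'NN']
```

Because the rules are applied strictly in order, the list itself is the whole model, and each rule can be read on its own as a statement about a local, fixed-distance context.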

The present article offers a survey of the most important theoretical work on TBL, addressing a perceived gap in the literature. Because the method should also be useful outside computational linguistics and natural language processing, a chief aim is to provide an informal but relatively comprehensive introduction that is also readable by people from other specialities.

References

  1. Harold Abelson and Gerald J. Sussman. 1996. Structure and Interpretation of Computer Programs. MIT Press, Cambridge.
  2. John Aberdeen, John Burger, David Day, Lynette Hirschman, Patricia Robinson, and Marc Vilain. 1995. MITRE: description of the Alembic system used for MUC-6. In Proceedings of the 6th Conference on Message Understanding. Association for Computational Linguistics, 141--155.
  3. Chinatsu Aone and Kevin Hausman. 1996. Unsupervised learning of a rule-based Spanish part of speech tagger. In Proceedings of the 16th Conference on Computational Linguistics, Vol. 1. Association for Computational Linguistics, 53--58.
  4. Nezip F. Ayan, Bonnie J. Dorr, and Christof Monz. 2005. Alignment link projection using transformation-based learning. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 185--192.
  5. Lalit R. Bahl, Peter F. Brown, Peter V. de Souza, and Robert L. Mercer. 1989. A tree-based statistical language model for natural language speech recognition. IEEE Transactions on Acoustics, Speech and Signal Processing 37, 7 (1989), 1001--1008.
  6. Markus Becker. 1998. Unsupervised part of speech tagging with extended templates. In Proceedings of ESSLLI 1998, Student Session.
  7. Gosse Bouma. 2000. A finite state and data oriented method for grapheme to phoneme conversion. In NAACL-2000. 303--310.
  8. Gosse Bouma. 2003. Finite state methods for hyphenation. Natural Language Engineering 9 (2003), 5--20. DOI:http://dx.doi.org/10.1017/S1351324903003073
  9. Leo Breiman. 1996. Bagging predictors. Machine Learning 24, 2 (1996), 123--140.
  10. Leo Breiman, Jerome Friedman, Richard Olshen, and Charles Stone. 1984. Classification and Regression Trees. Wadsworth and Brooks, Monterey, CA.
  11. Eric Brill. 1993a. Automatic grammar induction and parsing free text: A transformation-based approach. In Proceedings of the Workshop on Human Language Technology. Association for Computational Linguistics, 237--242.
  12. Eric Brill. 1993b. A Corpus-Based Approach to Language Learning. Ph.D. Dissertation. University of Pennsylvania, Philadelphia, PA.
  13. Eric Brill. 1994. Some advances in transformation-based part of speech tagging. In Proceedings of the 12th National Conference on Artificial Intelligence. Arxiv preprint cmp-lg/9406010 (1994), 722--727.
  14. Eric Brill. 1995a. Transformation-based error-driven learning and natural language processing: A case study in part of speech tagging. Computational Linguistics 21, 4 (1995), 543--565.
  15. Eric Brill. 1995b. Unsupervised learning of disambiguation rules for part of speech tagging. In Proceedings of the 3rd Workshop on Very Large Corpora, Vol. 30. 1--13.
  16. Eric Brill. 1996. Learning to parse with transformations. In Recent Advances in Parsing Technology. Kluwer.
  17. Eric Brill and Philip Resnik. 1994. A rule-based approach to prepositional phrase attachment disambiguation. In Proceedings of COLING'94. 1198--1204.
  18. Eric Brill and Jun Wu. 1998. Classifier combination for improved lexical disambiguation. In Proceedings of the 17th International Conference on Computational Linguistics, Vol. 1. Association for Computational Linguistics, 191--195.
  19. Björn Bringmann, Stefan Kramer, Friedrich Neubarth, Hannes Pirker, and Gerhard Widmer. 2002. Transformation-based regression. In Machine Learning: International Workshop then Conference. Citeseer, 59--66.
  20. Sandra Carberry, K. Vijay-Shanker, Andrew Wilson, and Ken Samuel. 2001. Randomized rule selection in transformation-based learning: A comparative study. Natural Language Engineering 7, 2 (2001), 99--116.
  21. John Carroll, Ted Briscoe, and Antonio Sanfilippo. 1998. Parser evaluation: A survey and a new proposal. In Proceedings of the 1st International Conference on Language Resources and Evaluation. 447--454.
  22. Rich Caruana. 1997. Multitask learning. Machine Learning 28, 1 (1997), 41--75.
  23. James R. Curran and Raymond K. Wong. 1999. Transformation-based learning for automatic translation from HTML to XML. In Proceedings of the 4th Australasian Document Computing Symposium (ADCS99). Citeseer.
  24. James R. Curran and Raymond K. Wong. 2000. Formalization of transformation-based learning. In ACSC. IEEE Computer Society, 51--57.
  25. Walter Daelemans. 1995. Memory-based lexical acquisition and processing. In Machine Translation and the Lexicon, P. Steffens (Ed.). Springer, Berlin, 85--98.
  26. David Day, John Aberdeen, Lynette Hirschman, Robyn Kozierok, Patricia Robinson, and Marc Vilain. 1997. Mixed-initiative development of language processing systems. In Proceedings of the Fifth Conference on Applied Natural Language Processing. Association for Computational Linguistics, 348--355.
  27. Luca Dini, Vittorio Di Tomaso, and Frédérique Segond. 1998. Error driven word sense disambiguation. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Vol. 1. Association for Computational Linguistics, 320--324.
  28. Cícero N. dos Santos. 2009. Entropy Guided Transformation Learning. Ph.D. Dissertation. Pontifícia Universidade Católica do Rio de Janeiro.
  29. Cícero N. dos Santos and Ruy L. Milidiú. 2007. Probabilistic classifications with TBL. In Computational Linguistics and Intelligent Text Processing, Alexander Gelbukh (Ed.). Lecture Notes in Computer Science, Vol. 4394. Springer, Berlin, 196--207.
  30. Cícero N. dos Santos and Ruy L. Milidiú. 2009. Entropy guided transformation learning. Foundations of Computational Intelligence 1 (2009), 159--184.
  31. Cícero N. dos Santos, Ruy L. Milidiú, Carlos E. M. Crestana, and Eraldo R. Fernandes. 2010. ETL Ensembles for Chunking, NER and SRL. In Computational Linguistics and Intelligent Text Processing, Alexander Gelbukh (Ed.). Lecture Notes in Computer Science, Vol. 6008. Springer, Berlin, 100--112.
  32. Cícero N. dos Santos, Ruy L. Milidiú, and Raúl Rentería. 2008. Portuguese part-of-speech tagging using entropy guided transformation learning. In Computational Processing of the Portuguese Language, António Teixeira, Vera de Lima, Luís de Oliveira, and Paulo Quaresma (Eds.). Lecture Notes in Computer Science, Vol. 5190. Springer, Berlin, 143--152.
  33. Cícero N. dos Santos and Claudia Oliveira. 2005. Constrained atomic term: Widening the reach of rule templates in transformation based learning. In EPIA (Lecture Notes in Computer Science), Carlos Bento, Amílcar Cardoso, and Gaël Dias (Eds.), Vol. 3808. Springer, 622--633. DOI:http://dx.doi.org/10.1007/11595014_61
  34. Philip Edmonds. 2002. SENSEVAL: The evaluation of word sense disambiguation systems. ELRA Newsletter 7, 3 (2002), 5--14.
  35. Eraldo R. Fernandes, Cícero N. dos Santos, and Ruy L. Milidiú. 2010. A machine learning approach to Portuguese clause identification. Computational Processing of the Portuguese Language (2010), 55--64.
  36. Radu Florian. 2002a. Named entity recognition as a house of cards: Classifier stacking. In Proceedings of the 6th Conference on Natural Language Learning. 1--4.
  37. Radu Florian. 2002b. Transformation Based Learning and Data-Driven Lexical Disambiguation. Syntactic and Semantic Ambiguity Resolution. Ph.D. Dissertation, Johns Hopkins University.
  38. Radu Florian, John Henderson, and Grace Ngai. 2000. Coaxing confidences from an old friend: Probabilistic classifications from transformation rule lists. In Proceedings of the 2000 Joint SIGDAT Conference on Empirical Methods in NLP and Very Large Corpora. Association for Computational Linguistics, 26--34.
  39. Radu Florian, Abe Ittycheriah, Hongyan Jing, and Tong Zhang. 2003. Named entity recognition through classifier combination. In Proceedings of the 7th Conference on Natural Language Learning at HLT-NAACL 2003, Vol. 4. Association for Computational Linguistics, 171.
  40. Radu Florian and Grace Ngai. 2001. Multidimensional transformation-based learning. In Proceedings of the 5th Workshop on Computational Language Learning (CoNLL-2001). CoRR cs.CL/0107021 (2001).
  41. Cameron Fordyce. 1998. Prosody Prediction for Speech Synthesis Using Transformational Rule-Based Learning. Master's Thesis, Boston University.
  42. Yoav Freund, Raj Iyer, Robert E. Schapire, and Yoram Singer. 2003. An efficient boosting algorithm for combining preferences. Journal of Machine Learning Research 4 (2003), 933--969.
  43. William A. Gale, Kenneth W. Church, and David Yarowsky. 1992. A method for disambiguating word senses in a large corpus. Computers and the Humanities 26, 5--6 (1992), 415--439.
  44. Daniel Hardt. 1998. Improving ellipsis resolution with transformation-based learning. In AAAI Fall Symposium.
  45. Daniel Hardt. 2001. Transformation-based learning of Danish grammar correction. In Proceedings of RANLP 2001, Tzigov Chark. Citeseer.
  46. Per Hedelin, Anders Jonsson, and Per Lindblad. 1987. Svenskt uttalslexikon (3rd ed.). Technical report. Chalmers University of Technology.
  47. Mark Hepple. 2000. Independence and commitment: Assumptions for rapid training and execution of rule-based POS taggers. In Proceedings of the 38th Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics, 278.
  48. Paul Hudak. 1996. Building domain-specific embedded languages. ACM Computing Surveys (CSUR) 28, 4 (1996).
  49. Paul Hudak. 1998. Modular domain specific languages and tools. In Proceedings of the 5th International Conference on Software Reuse, P. Devanbu and J. Poulin (Eds.). IEEE Computer Society Press, 134--142.
  50. Daniel Jurafsky and James H. Martin. 2008. An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition (2nd ed.). Prentice-Hall.
  51. Fred Karlsson, Atro Voutilainen, Juha Heikkilä, and Arto Anttila (Eds.). 1995. Constraint Grammar: A Language-Independent System for Parsing Unrestricted Text. Mouton de Gruyter.
  52. Ergina Kavallieratou, Efstathios Stamatatos, Nikos Fakotakis, and George Kokkinakis. 2000. Handwritten character segmentation using transformation-based learning. In Proceedings of the 15th International Conference on Pattern Recognition (ICPR'00). 634--637.
  53. Joungbum Kim, Sarah E. Schwarm, and Mari Ostendorf. 2004. Detecting structural metadata with decision trees and transformation-based learning. In Proceedings of HLT-NAACL04. 137--144.
  54. Ludmila I. Kuncheva and Christopher J. Whitaker. 2003. Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Machine Learning 51, 2 (2003), 181--207.
  55. Torbjörn Lager. 1999a. μ-TBL Lite: A small, extensible transformation-based learner. In Proceedings of the 9th Conference of the European Chapter of the Association for Computational Linguistics (EACL'99). Bergen. Poster paper.
  56. Torbjörn Lager. 1999b. The μ-TBL system: Logic programming tools for transformation-based learning. In Proceedings of CoNLL, Vol. 99.
  57. Torbjörn Lager. 2001. Transformation-based learning of rules for constraint grammar tagging. In 13th Nordic Conference in Computational Linguistics. Uppsala, Sweden, 21--22.
  58. Torbjörn Lager and Natalia Zinovjeva. 1999. Training a dialogue act tagger with the μ-TBL System. In Proceedings of the 3rd Swedish Symposium on Multimodal Communication. Linköping University Natural Language Processing Laboratory (NLPLAB).
  59. Niels Landwehr, Bernd Gutmann, Ingo Thon, Luc De Raedt, and Matthai Philipose. 2008. Relational transformation-based tagging for human activity recognition. Fundamenta Informaticae 89, 1 (2008), 111--129.
  60. Xin Li, Xuan-Jing Huang, and Li-de Wu. 2006. Question classification by ensemble learning. IJCSNS 6, 3 (2006), 147.
  61. Nikolaj Lindberg and Martin Eineborg. 1998. Learning constraint grammar-style disambiguation rules using inductive logic programming. In Proceedings of the 17th International Conference on Computational Linguistics. Association for Computational Linguistics, 775--779.
  62. Lidia Mangu and Eric Brill. 1997. Automatic rule acquisition for spelling correction. In Machine Learning -- International Workshop then Conference. Citeseer, 187--194.
  63. Christopher D. Manning and Hinrich Schütze. 2001. Foundations of Statistical Natural Language Processing. MIT Press, Cambridge, MA.
  64. Andrei Mikheev. 1997. Automatic rule induction for unknown-word guessing. Computational Linguistics 23, 3 (1997), 405--423.
  65. Ruy Luiz Milidiú, C. E. M. Crestana, and Cícero Nogueira dos Santos. 2010. A token classification approach to dependency parsing. In Proceedings of the 7th Brazilian Symposium on Information and Human Language Technology (STIL'09). IEEE, 80--88.
  66. Ruy L. Milidiú, Cícero N. dos Santos, and Julio C. Duarte. 2008. Phrase chunking using entropy guided transformation learning. In Proceedings of ACL 2008. Citeseer.
  67. Ruy L. Milidiú, Julio C. Duarte, and Cícero N. dos Santos. 2007. Evolutionary TBL template generation. Journal of the Brazilian Computer Society 13, 4 (2007), 39--50.
  68. Tom Mitchell. 1997. Machine Learning. McGraw-Hill.
  69. Un Yong Nahm. 2005. Transformation-based information extraction using learned meta-rules. Computational Linguistics and Intelligent Text Processing (2005), 535--538.
  70. Lee Naish. 1996. Higher-order logic programming in Prolog. In Proceedings of the Workshop on Multi-Paradigm Logic Programming, JICSLP, Vol. 96.
  71. Grace Ngai and Radu Florian. 2001a. Transformation-based learning in the fast lane. In Proceedings of the 2nd Meeting of the North American Chapter of the Association for Computational Linguistics on Language Technologies 2001. Association for Computational Linguistics, 8.
  72. Grace Ngai and Radu Florian. 2001b. Transformation Based Learning in the Fast Lane: A Generative Approach. Technical Report. Center for Speech and Language Processing, Johns Hopkins University.
  73. Kemal Oflazer and Gökhan Tür. 1996. Combining hand-crafted rules and unsupervised learning in constraint-based morphological disambiguation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. 69--81.
  74. Jonathan Oliver. 1992. Decision Graphs: An Extension of Decision Trees. Technical Report 92/173. Department of Computer Science, Monash University.
  75. David D. Palmer. 1997. A trainable rule-based algorithm for word segmentation. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 321--328.
  76. Seong-Bae Park, Jeong-Ho Chang, and Byoung-Tak Zhang. 2004. Korean compound noun decomposition using syllabic information only. Computational Linguistics and Intelligent Text Processing (2004), 146--157.
  77. Fernando Pereira and Yves Schabes. 1992. Inside-outside reestimation from partially bracketed corpora. In ACL.
  78. J. Ross Quinlan. 1993. C4.5: Programs for Machine Learning. Morgan Kaufmann.
  79. Lance A. Ramshaw and Mitchell P. Marcus. 1994. Exploring the statistical derivation of transformational rule sequences for part-of-speech tagging. In Proceedings of the ACL Workshop on Combining Symbolic and Statistical Approaches to Language. 128--135.
  80. Lance A. Ramshaw and Mitchell P. Marcus. 1995. Text chunking using transformation-based learning. In Proceedings of the ACL 3rd Workshop on Very Large Corpora, David Yarowsky and Kenneth W. Church (Eds.), Vol. cmp-lg/9505040. Association of Computational Linguistics, Somerset, NJ, 82--94.
  81. Ronald Rivest. 1987. Learning decision lists. Machine Learning 2, 3 (1987), 229--246.
  82. Emmanuel Roche and Yves Schabes. 1995. Deterministic part-of-speech tagging with finite-state transducers. Computational Linguistics 21, 2 (1995), 227--253.
  83. Dan Roth. 1998. Learning to resolve natural language ambiguities: A unified approach. In Proceedings of the National Conference on Artificial Intelligence. John Wiley & Sons Ltd., 806--813.
  84. Tobias Ruland. 2000. A context-sensitive model for probabilistic LR parsing of spoken language with transformation-based postprocessing. In Proceedings of the 18th Conference on Computational Linguistics, Vol. 2. Association for Computational Linguistics, 677--683.
  85. Ken Samuel. 1998a. Discourse learning: Dialogue act tagging with transformation-based learning. In Proceedings of the National Conference on Artificial Intelligence. John Wiley and Sons, Ltd., 1199--1199.
  86. Ken Samuel. 1998b. Lazy transformation-based learning. In Proceedings of the 11th International Florida Artificial Intelligence Research Society Conference. AAAI Press, 235--239.
  87. Ken Samuel, Sandra Carberry, and K. Vijay-Shanker. 1998. An investigation of transformation-based learning in discourse. In Machine Learning: Proceedings of the 15th International Conference.
  88. Christer Samuelsson, Pasi Tapanainen, and Atro Voutilainen. 1996. Inducing constraint grammars. Grammatical Inference: Learning Syntax from Sentences (1996), 146--155.
  89. Erik Tjong Kim Sang and Jorn Veenstra. 1999. Representing text chunks. In Proceedings of the 9th Conference on European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 173--179.
  90. Yoshimasa Tsuruoka, John McNaught, and Sophia Ananiadou. 2008. Normalizing biomedical terms by minimizing ambiguity and variability. BMC Bioinformatics 9, Suppl 3 (2008), S2.
  91. Leslie G. Valiant. 1984. A theory of the learnable. Communications of the ACM 27, 11 (1984), 1134--1142.
  92. Arie van Deursen, Paul Klint, and Joost Visser. 2000. Domain-specific languages: An annotated bibliography. ACM SIGPLAN Notices 35, 6 (2000), 26--36.
  93. Ken Williams, Christopher Dozier, and Andrew McCulloh. 2004. Learning transformation rules for semantic role labeling. In Proceedings of CoNLL-2004.
  94. Garnett Wilson and Malcolm Heywood. 2005. Use of a genetic algorithm in Brill's transformation-based part-of-speech tagger. In GECCO'05: Proceedings of the 2005 Conference on Genetic and Evolutionary Computation. ACM, New York, NY, 2067--2073. DOI:http://dx.doi.org/10.1145/1068009.1068352
  95. David Wolpert. 1992. Stacked generalization. Neural Networks 5, 2 (1992), 241--260.
  96. Dekai Wu, Grace Ngai, and Marine Carpuat. 2004. Raising the bar: Stacked conservative error correction beyond boosting. In Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC-2004). Lisbon.
  97. George K. Zipf. 1949. Human Behavior and the Principle of Least Effort. Addison-Wesley.
  98. Win Zonneveld, Mieke Trommelen, Michael Jessen, Curtis Rice, Gösta Bruce, and Kristjan Arnason. 1999. Wordstress in West-Germanic and North-Germanic languages. In Word Prosodic Systems in the Languages of Europe, Harry van der Hulst (Ed.). Walter de Gruyter, Chapter 8, 477--604.

Published in

ACM Computing Surveys, Volume 46, Issue 4
April 2014
463 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/2597757

Copyright © 2014 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 1 April 2014
• Accepted: 1 October 2013
• Revised: 1 September 2013
• Received: 1 October 2012