Entropy-Guided Feature Generation for Structured Learning of Portuguese Dependency Parsing

Fernandes, Eraldo R.; Milidiú, Ruy L.

doi:10.1007/978-3-642-28885-2_17

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7243)

Included in the following conference series:

International Conference on Computational Processing of the Portuguese Language

Abstract

Feature generation is a difficult, yet highly necessary, subtask of machine learning modeling. Usually, it is partially solved by a domain expert who generates complex and discriminative feature templates by conjoining the available basic features. This is a limited and expensive way to obtain feature templates and is recognized as a modeling bottleneck. In this work, we propose an automatic method to generate feature templates for structured learning algorithms. The method receives as input the training dataset with basic features and produces a set of feature templates by conjoining basic features that are highly discriminative together. We call this method entropy-guided because it is based on the conditional entropy of local decision variables given the feature values. We illustrate our approach on the Portuguese dependency parsing task and report on experiments with the Bosque corpus. We show that the entropy-guided templates outperform the manually built templates used by MSTParser, which was the best-performing system on the Bosque corpus until now. Furthermore, our approach allows an effortless inclusion of two new basic features that automatically generate additional templates. As a result, our system achieves a per-token accuracy of 92.66%, which represents a reduction of more than 15% in the previous smallest error rate for Portuguese dependency parsing.
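The idea of scoring feature conjunctions by conditional entropy can be illustrated with a minimal Python sketch. It is not the authors' procedure, which the abstract does not spell out; it simply computes the empirical conditional entropy H(Y | X_S) of a binary decision variable for every candidate conjunction S of basic features and ranks the candidates from most to least discriminative. The feature names (head_pos, dep_pos), the toy arc data, and the helper functions are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import log2


def conditional_entropy(rows, labels, feature_names):
    """Empirical H(Y | X_S) in bits, where X_S conjoins the named features."""
    # Group the labels by the joint value of the selected features.
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[name] for name in feature_names)
        groups[key][label] += 1
    total = len(labels)
    entropy = 0.0
    for counts in groups.values():
        group_size = sum(counts.values())
        for count in counts.values():
            # p(x, y) * log2 p(y | x), summed over observed (x, y) pairs
            entropy -= (count / total) * log2(count / group_size)
    return entropy


def rank_templates(rows, labels, max_size=2):
    """Rank candidate templates (feature conjunctions) by conditional entropy."""
    names = sorted(rows[0])
    candidates = [
        combo
        for size in range(1, max_size + 1)
        for combo in combinations(names, size)
    ]
    # Lower entropy first; on ties, prefer the smaller template.
    return sorted(
        candidates,
        key=lambda combo: (conditional_entropy(rows, labels, combo), len(combo)),
    )


# Hypothetical toy data: each row is a candidate (head, dependent) arc,
# and the label says whether the arc belongs to the correct tree.
rows = [
    {"head_pos": "noun", "dep_pos": "det"},
    {"head_pos": "verb", "dep_pos": "noun"},
    {"head_pos": "noun", "dep_pos": "noun"},
    {"head_pos": "verb", "dep_pos": "det"},
]
labels = [1, 1, 0, 0]

ranked = rank_templates(rows, labels)
for template in ranked:
    print(template, conditional_entropy(rows, labels, template))
```

In this toy set neither basic feature is informative on its own (each leaves one bit of uncertainty about the label), while their conjunction determines the label completely, so the two-feature template is ranked first. This mirrors the notion of basic features that are "highly discriminative together".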

Author information

Authors and Affiliations

PUC-Rio, Rio de Janeiro, Brazil
Eraldo R. Fernandes & Ruy L. Milidiú
IFG, Jataí, Brazil
Eraldo R. Fernandes

Editor information

Editors and Affiliations

UFSCAR, Rod. Washington Luís, 13565-905, São Carlos, Brazil
Helena Caseli
UFRGS, Av. Bento Gonçalves, 9500, 91501-970, Porto Alegre, Brazil
Aline Villavicencio
DETI/IEETA, Universidade de Aveiro, Campus Universitário de Santiago, 3810-193, Aveiro, Portugal
António Teixeira
UC/ IT, DEEC, Universidade de Coimbra, Polo 2, 3030-290, Coimbra, Portugal
Fernando Perdigão

Cite this paper

Fernandes, E.R., Milidiú, R.L. (2012). Entropy-Guided Feature Generation for Structured Learning of Portuguese Dependency Parsing. In: Caseli, H., Villavicencio, A., Teixeira, A., Perdigão, F. (eds) Computational Processing of the Portuguese Language. PROPOR 2012. Lecture Notes in Computer Science, vol 7243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28885-2_17

DOI: https://doi.org/10.1007/978-3-642-28885-2_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28884-5
Online ISBN: 978-3-642-28885-2
eBook Packages: Computer Science, Computer Science (R0)
