Abstract
Semi-supervised learning is frequently used when only a small labeled training set is available alongside a large set of unlabeled samples. In this paper, we combine Hidden Markov Models and Transformation-Based Learning in a semi-supervised learning approach. We apply two semi-supervised techniques, self-training and co-training, to this scheme in order to classify Portuguese noun phrases. Our main goal is to show that a semi-supervised technique makes effective noun phrase extraction possible with fewer tagged examples. Our models show good improvement when the labeled corpus is small and little improvement when it is large.
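As an illustration of the self-training idea named in the abstract, the following is a minimal sketch in Python. It is not the authors' method: the Hidden Markov Model and Transformation-Based Learning components are replaced here by a toy per-word tag counter, and the function names (`train`, `predict`, `self_train`), the confidence threshold, and the "I"/"O" tag set are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def train(examples):
    """Count how often each word carries each tag.

    Toy stand-in for a real tagger (hypothetical, not the paper's HMM/TBL).
    """
    counts = defaultdict(Counter)
    for word, tag in examples:
        counts[word][tag] += 1
    return counts


def predict(model, word):
    """Return (tag, confidence); confidence is the tag's relative frequency."""
    if word not in model:
        return None, 0.0
    tag, n = model[word].most_common(1)[0]
    return tag, n / sum(model[word].values())


def self_train(labeled, unlabeled, threshold=0.9, max_rounds=5):
    """Self-training loop: retrain, label the pool, keep confident labels."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        confident, remaining = [], []
        for word in pool:
            tag, conf = predict(model, word)
            if tag is not None and conf >= threshold:
                confident.append((word, tag))  # trust the model's own label
            else:
                remaining.append(word)         # leave for a later round
        if not confident:
            break  # nothing new was added, so retraining would change nothing
        labeled.extend(confident)
        pool = remaining
    return train(labeled), pool


# Hypothetical toy data: "I" marks a token inside a noun phrase, "O" outside.
seed = [("o", "I"), ("gato", "I"), ("dorme", "O")]
model, leftover = self_train(seed, ["gato", "corre", "o"])
```

In this sketch the word "corre" never appears in the labeled seed, so the toy model cannot label it and it stays in the unlabeled pool. Co-training follows the same loop but uses two classifiers that label examples for each other.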
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Milidiú, R., Santos, C., Duarte, J., Rentería, R. (2006). Semi-supervised Learning for Portuguese Noun Phrase Extraction. In: Vieira, R., Quaresma, P., Nunes, M.d.G.V., Mamede, N.J., Oliveira, C., Dias, M.C. (eds) Computational Processing of the Portuguese Language. PROPOR 2006. Lecture Notes in Computer Science, vol 3960. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11751984_21
DOI: https://doi.org/10.1007/11751984_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34045-4
Online ISBN: 978-3-540-34046-1
eBook Packages: Computer Science, Computer Science (R0)