Abstract
This paper shows how the best data-driven dependency parsers available today [1] can be improved by learning from unlabeled data. We focus on German and Swedish and show that labeled attachment scores improve by 1.5–2.5%. Error analysis shows that the improvements are primarily due to better recovery of long-distance dependencies.
References
Martins, A., Das, D., Smith, N., Xing, E.: Stacking dependency parsers. In: EMNLP, Honolulu, Hawaii (2008)
Rimell, L., Clark, S., Steedman, M.: Unbounded dependency recovery for parser evaluation. In: EMNLP, Singapore (2009)
Abney, S.: Semi-supervised learning for computational linguistics. Chapman and Hall, Boca Raton (2008)
Wolpert, D.: Stacked generalization. Neural Networks 5, 241–259 (1992)
Sagae, K., Lavie, A.: Parser combination by reparsing. In: HLT-NAACL, New York City, NY (2006)
Hall, J., et al.: Single malt or blended? In: CONLL, Prague, Czech Republic (2007)
Nivre, J., McDonald, R.: Integrating graph-based and transition-based dependency parsers. In: ACL-HLT, Columbus, Ohio (2008)
Fishel, M., Nivre, J.: Voting and stacking in data-driven dependency parsing. In: NODALIDA, Odense, Denmark (2009)
Surdeanu, M., Manning, C.: Ensemble models for dependency parsing: cheap and good? In: NAACL, Los Angeles, CA (2010)
Li, M., Zhou, Z.H.: Tri-training: exploiting unlabeled data using three classifiers. IEEE Transactions on Knowledge and Data Engineering 17(11), 1529–1541 (2005)
Koo, T., Carreras, X., Collins, M.: Simple semi-supervised dependency parsing. In: ACL, Columbus, Ohio (2008)
Wang, Q., Lin, D., Schuurmans, D.: Semi-supervised convex training for dependency parsing. In: ACL, Columbus, Ohio (2008)
Suzuki, J., Isozaki, H., Carreras, X., Collins, M.: An empirical study of semi-supervised structured conditional models for dependency parsing. In: EMNLP, Singapore (2009)
Sagae, K., Tsujii, J.: Dependency parsing and domain adaptation with LR models and parser ensembles. In: EMNLP-CONLL, Prague, Czech Republic (2007)
Chen, W., Zhang, Y., Isahara, H.: Chinese chunking with tri-training learning. In: Computer Processing of Oriental Languages, pp. 466–473. Springer, Berlin (2006)
Nguyen, T., Nguyen, L., Shimazu, A.: Using semi-supervised learning for question classification. Journal of Natural Language Processing 15, 3–21 (2008)
Sindhwani, V., Keerthi, S.: Large scale semi-supervised linear SVMs. In: ACM SIGIR, Seattle, WA (2006)
McDonald, R., Pereira, F., Ribarov, K., Hajič, J.: Non-projective dependency parsing using spanning tree algorithms. In: HLT-EMNLP, Vancouver, British Columbia (2005)
Nivre, J., et al.: MaltParser. Natural Language Engineering 13(2), 95–135 (2007)
Breiman, L.: Random forests. Machine Learning 45, 5–32 (2001)
Brants, S., Hansen, S., Lezius, W., Smith, G.: The TIGER treebank. In: TLT, Sozopol, Bulgaria (2002)
Nilsson, J., Hall, J., Nivre, J.: MAMBA meets TIGER: Reconstructing a Swedish treebank from antiquity. In: NODALIDA, Joensuu, Finland (2005)
Gimenez, J., Marquez, L.: SVMTool: a general POS tagger generator based on support vector machines. In: LREC, Lisbon, Portugal (2004)
Eisner, J.: Three new probabilistic models for dependency parsing. In: COLING, Copenhagen, Denmark (1996)
Zeman, D., Žabokrtský, Z.: Improving parsing accuracy by combining diverse dependency parsers. In: IWPT, Vancouver, Canada (2005)
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Søgaard, A., Rishøj, C. (2010). The Effect of Semi-supervised Learning on Parsing Long Distance Dependencies in German and Swedish. In: Loftsson, H., Rögnvaldsson, E., Helgadóttir, S. (eds) Advances in Natural Language Processing. NLP 2010. Lecture Notes in Computer Science, vol 6233. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14770-8_44
DOI: https://doi.org/10.1007/978-3-642-14770-8_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14769-2
Online ISBN: 978-3-642-14770-8
eBook Packages: Computer Science, Computer Science (R0)