Abstract
In this paper we present WCCL Match, a formalism for text annotation. The need for a new formalism arose from our work on Question Answering for Polish: we examined several existing formalisms and concluded that none of them fulfilled our requirements. The new formalism is built on top of WCCL, an existing language for writing morphosyntactic functional expressions. The major features of WCCL Match are creation of new annotations, modification of existing ones, support for overlapping annotations, explicit access to tagset attributes, and reference to context outside the captured annotation. We discuss three applications of the formalism: recognition of proper names, question analysis, and question-to-query transformation. The implementation of WCCL Match is language-independent and can be used for almost any natural language.
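To make the listed features more concrete, the following is a minimal Python sketch of a rule that creates a new annotation, reads a tagset attribute explicitly, checks a token outside the captured span, and leaves an overlapping annotation in place. It is not WCCL Match syntax and does not reproduce the authors' implementation; the names `Token`, `Annotation`, and `annotate_names`, the rule itself, and the sample sentence are all hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    orth: str                                   # surface form of the token
    attrs: dict = field(default_factory=dict)   # tagset attributes (illustrative)


@dataclass
class Annotation:
    label: str
    start: int   # index of the first token (inclusive)
    end: int     # index just past the last token (exclusive)


def annotate_names(tokens, annotations):
    """Hypothetical rule: mark a capitalised noun as 'person' when the
    preceding token (context outside the captured span) is the title 'prof.'."""
    for i in range(1, len(tokens)):
        tok = tokens[i]
        is_noun = tok.attrs.get("class") == "subst"       # explicit attribute access
        has_title = tokens[i - 1].orth.lower() == "prof."  # look-behind context
        if has_title and is_noun and tok.orth[:1].isupper():
            # Only the noun is captured; the title stays outside the annotation.
            annotations.append(Annotation("person", i, i + 1))
    return annotations


# Assumed sample input: three tokens with illustrative grammatical classes.
tokens = [
    Token("prof.", {"class": "brev"}),
    Token("Kowalski", {"class": "subst"}),
    Token("przyjechał", {"class": "praet"}),
]

# An existing annotation covering tokens 0..1 overlaps the new 'person' span.
existing = [Annotation("np", 0, 2)]
result = annotate_names(tokens, existing)
```

In this sketch annotations are stored as independent spans rather than as a single label per token, which is one simple way to let the `np` and `person` spans overlap without conflict.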
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Marcińczuk, M., Radziszewski, A. (2013). WCCL Match – A Language for Text Annotation. In: Kłopotek, M.A., Koronacki, J., Marciniak, M., Mykowiecka, A., Wierzchoń, S.T. (eds) Language Processing and Intelligent Information Systems. IIS 2013. Lecture Notes in Computer Science, vol 7912. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38634-3_15
DOI: https://doi.org/10.1007/978-3-642-38634-3_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38633-6
Online ISBN: 978-3-642-38634-3
eBook Packages: Computer Science, Computer Science (R0)