Abstract
Relation extraction is the task of finding semantic relations between entities in text. This paper presents our approach to relation extraction for Vietnamese text using Conditional Random Fields (CRFs). The features used in the system are words, part-of-speech tags, entity types, the types of other entities in the sentence, the entity’s index, and contextual information. To evaluate the effect of contextual information on system performance, we tested different window sizes in our experiments. The results show that the window size affects system performance, but the F-score does not increase in direct proportion to the window size. Our future work includes: (i) testing the system on a larger corpus to obtain a more accurate evaluation; (ii) investigating other features for the CRF model to improve system performance; and (iii) researching methods to extract relations that span beyond a single sentence.
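The feature set listed in the abstract can be illustrated with a short sketch. The code below is not the authors' implementation: the token layout (a dict with "word", "pos", and "entity" keys), the function names, the "PAD" marker, and the sample tags are all assumptions made for illustration, and the token position stands in for the entity's index. It shows how a window-size parameter controls how much contextual information each token contributes.

```python
from typing import Dict, List

# Hypothetical token layout (an assumption, not from the paper):
# each token carries a word, a POS tag, and an entity type ("O" = no entity).
Token = Dict[str, str]


def token_features(sent: List[Token], i: int, window: int) -> Dict[str, str]:
    """Build a feature dict for token i with a symmetric context window."""
    tok = sent[i]
    feats = {
        "word": tok["word"].lower(),
        "pos": tok["pos"],
        "entity": tok["entity"],
        "index": str(i),  # token position, standing in for the entity's index
    }
    # Types of the other entities that occur in the same sentence.
    others = sorted({t["entity"] for j, t in enumerate(sent)
                     if j != i and t["entity"] != "O"})
    feats["other_entities"] = "|".join(others) if others else "NONE"
    # Contextual information: words and POS tags within `window` positions.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        if 0 <= j < len(sent):
            feats[f"word[{off:+d}]"] = sent[j]["word"].lower()
            feats[f"pos[{off:+d}]"] = sent[j]["pos"]
        else:
            # Position falls outside the sentence boundary.
            feats[f"pad[{off:+d}]"] = "PAD"
    return feats


def sentence_features(sent: List[Token], window: int) -> List[Dict[str, str]]:
    """Feature dicts for every token in the sentence."""
    return [token_features(sent, i, window) for i in range(len(sent))]


# Made-up example sentence with illustrative tags.
sample = [
    {"word": "Nam", "pos": "Np", "entity": "PER"},
    {"word": "sống", "pos": "V", "entity": "O"},
    {"word": "ở", "pos": "E", "entity": "O"},
    {"word": "Huế", "pos": "Np", "entity": "LOC"},
]
features = sentence_features(sample, window=1)
```

A list of per-token feature dicts like this is the input shape that common CRF toolkits accept, so rerunning the same pipeline with different `window` values is one way to compare F-scores across window sizes.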
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sam, R.C., Le, H.T., Nguyen, T.T., Trinh, T.M. (2010). Relation Extraction in Vietnamese Text Using Conditional Random Fields. In: Cheng, PJ., Kan, MY., Lam, W., Nakov, P. (eds) Information Retrieval Technology. AIRS 2010. Lecture Notes in Computer Science, vol 6458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17187-1_32
DOI: https://doi.org/10.1007/978-3-642-17187-1_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17186-4
Online ISBN: 978-3-642-17187-1
eBook Packages: Computer Science, Computer Science (R0)