Abstract
In this paper we discuss an approach to named entity recognition (NER) based on grammatical inference (GI). Previous GI approaches have aimed at constructing a grammar underlying a given text source. It has been noted that the rules produced by GI can also be interpreted semantically [16]: a non-terminal describes interchangeable elements that are instances of the same concept. This observation leads to the hypothesis that GI might be useful for finding concept instances in a text. Furthermore, it should also be possible to discover relations between concepts or, more precisely, the way such relations are expressed linguistically.
In this paper, we propose a general framework for using GI for named entity recognition by discussing several possible approaches. In addition, we demonstrate that these methods work successfully on biomedical data using an existing GI tool.
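The semantic reading of GI rules described above can be illustrated with a minimal sketch. It is not the method of the paper and not the GI tool the paper uses; it only assumes that tokens occurring in an identical left/right context are candidates for the same non-terminal. The function name `substitution_classes`, the `min_size` parameter, and the toy sentences are hypothetical and introduced purely for illustration.

```python
from collections import defaultdict


def substitution_classes(sentences, min_size=2):
    """Group tokens that occur in an identical (left, right) context.

    Illustrative assumption: such tokens are interchangeable and could be
    covered by one non-terminal, i.e. treated as instances of one concept.
    """
    contexts = defaultdict(set)
    for sentence in sentences:
        # Pad with boundary markers so first and last tokens get a context.
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            contexts[(tokens[i - 1], tokens[i + 1])].add(tokens[i])
    # Keep only contexts shared by at least `min_size` distinct tokens.
    return {
        ctx: sorted(words)
        for ctx, words in contexts.items()
        if len(words) >= min_size
    }


# Toy sentences invented for illustration (not taken from the paper's data).
toy = [
    "the protein p53 binds DNA",
    "the protein BRCA1 binds DNA",
    "the gene p53 regulates growth",
]
classes = substitution_classes(toy)
for context, words in classes.items():
    print(context, words)
```

On this toy input, the context ("protein", "binds") groups candidate entity names, while ("the", "p53") groups candidate concept labels. A real GI system generalizes far beyond single-token contexts, so this should be read only as an intuition for why non-terminals can correspond to concepts.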
References
Adriaans, P., van Zaanen, M.: Computational Grammar Induction for Linguists. Grammars 7, 57–68 (2004)
Craven, M., Kumlien, J.: Constructing Biological Knowledge Bases by Extracting Information from Text Sources. In: Proceedings of the 7th International Conference on Intelligent Systems for Molecular Biology (ISMB 1999) (1999)
Dietterich, T.G.: Ensemble Methods in Machine Learning. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 1–15. Springer, Heidelberg (2000)
Freitag, D.: Using Grammatical Inference to Improve Precision in Information Extraction. In: Workshop on Grammatical Inference, Automata Induction, and Language Acquisition (ICML 1997), Nashville (1997)
Hachey, B., Grover, C., et al.: Use of Ontologies for Cross-lingual Information Management in the Web. In: Proceedings of the Ontologies and Information Extraction International Workshop (EUROLAN 2003), Bucharest, Romania (July 28 - August 8, 2003)
Hahn, U., Romacker, M.: An Integrated Model of Semantic and Conceptual Interpretation from Dependency Structures. In: Proceedings of the 18th Conference on Computational Linguistics, Saarbrücken, Germany, pp. 271–277 (2000)
Hearst, M.A.: Automatic Acquisition of Hyponyms from Large Text Corpora. In: Proceedings of the 14th International Conference on Computational Linguistics, Nantes, France (1992)
Karampatziakis, N., Paliouras, G., Pierrakos, D., Stamatopoulos, P.: Navigation pattern discovery using grammatical inference. In: Paliouras, G., Sakakibara, Y. (eds.) ICGI 2004. LNCS (LNAI), vol. 3264, pp. 187–198. Springer, Heidelberg (2004)
Katrenko, S., Adriaans, P.W.: Learning Relations from Biomedical Corpora Using Dependency Tree Levels. In: Benelearn 2006 (2006)
Kim, J.-D., et al.: Introduction to the Bio-Entity Recognition Task at JNLPBA. In: JNLPBA 2004 (2004)
Kunik, V., Solan, Z., Edelman, S., Ruppin, E., Horn, D.: Motif Extraction and Protein Classification. In: CSB (2005)
Pradhan, S., Hacioglu, K., Ward, W., Martin, J.H., Jurafsky, D.: Semantic Role Chunking Combining Complementary Syntactic Views. In: Proceedings of the 9th Conference on Natural Language Learning (CoNLL 2005), Ann Arbor, MI (2005)
Reinberger, M.-L., Spyns, P., Pretorius, A.J., Daelemans, W.: Automatic initiation of an ontology. In: Meersman, R., Tari, Z. (eds.) OTM 2004. LNCS, vol. 3290, pp. 600–617. Springer, Heidelberg (2004)
Roberts, A., Atwell, E.: The Use of Corpora for Automatic Evaluation of Grammar Inference Systems. In: Proceedings of the Corpus Linguistics 2003 Conference (2003)
Sigletos, G., Paliouras, G., Spyropoulos, C.D., Hatzopoulos, M.: Voting and Stacked Generalization. In: JMLR, pp. 1751–1782 (2005)
Solan, Z., Ruppin, E., Horn, D., Edelman, S.: Automatic acquisition and efficient representation of syntactic structures. In: NIPS (2002)
Thelen, M., Riloff, E.: A Bootstrapping Method for Learning Semantic Lexicons using Extraction Pattern Contexts. In: Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2002)
Valarakos, A.G., Paliouras, G., Karkaletsis, V., Vouros, G.A.: Enhancing Ontological Knowledge Through Ontology Population and Enrichment. In: Motta, E., Shadbolt, N.R., Stutt, A., Gibbins, N. (eds.) EKAW 2004. LNCS (LNAI), vol. 3257, pp. 144–156. Springer, Heidelberg (2004)
van Zaanen, M., Adriaans, P.: Alignment-Based Learning versus EMILE: A Comparison. In: Proceedings of the Belgian-Dutch Conference on Artificial Intelligence (BNAIC), Amsterdam, The Netherlands, pp. 315–322 (2001)
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Katrenko, S., Adriaans, P. (2006). Grammatical Inference in Practice: A Case Study in the Biomedical Domain. In: Sakakibara, Y., Kobayashi, S., Sato, K., Nishino, T., Tomita, E. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2006. Lecture Notes in Computer Science, vol 4201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11872436_16
DOI: https://doi.org/10.1007/11872436_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45264-5
Online ISBN: 978-3-540-45265-2
eBook Packages: Computer Science, Computer Science (R0)