Abstract
This paper discusses an algorithm for identifying the semantic arguments of a verb, the word senses of a polysemous word, and the noun phrases in a sentence. The heart of the algorithm is a probabilistic graphical model. In contrast with other existing graphical models, such as Naive Bayes models, CRFs, HMMs, and MEMMs, this model determines a sequence of optimal class assignments among M choices for a sequence of N input symbols without using dynamic programming; it runs in O(MN) time and uses O(M) memory. Experiments conducted on standard data sets show encouraging results.
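The abstract does not specify the form of the model, so the following is only an illustrative sketch of how a sequence labeler can reach O(MN) time and O(M) working memory without dynamic programming. It assumes a simple greedy left-to-right decoder, which is not necessarily the authors' method. The names `greedy_decode`, `ScoreFn`, and `toy_score`, and the toy label scheme (0 = outside, 1 = begin, 2 = inside), are hypothetical.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical scoring function: score(symbol, label, previous_label) -> float.
# In a real system this would come from a trained probabilistic model.
ScoreFn = Callable[[str, int, Optional[int]], float]


def greedy_decode(symbols: Sequence[str], num_classes: int, score: ScoreFn) -> List[int]:
    """Assign one of num_classes (M) labels to each of the N symbols.

    Each position scores all M labels once, giving O(MN) time. Beyond the
    output list, the only working storage is the M scores for the current
    position, so scratch memory is O(M). No table over all positions is kept,
    unlike Viterbi-style dynamic programming.
    """
    labels: List[int] = []
    prev: Optional[int] = None
    for sym in symbols:
        scores = [score(sym, c, prev) for c in range(num_classes)]
        prev = max(range(num_classes), key=scores.__getitem__)
        labels.append(prev)
    return labels


def toy_score(sym: str, label: int, prev: Optional[int]) -> float:
    """Toy stand-in for a model: capitalized tokens form a chunk (assumption)."""
    in_chunk = sym[0].isupper()
    if label == 0:  # outside any chunk
        return 0.0 if in_chunk else 1.0
    if label == 1:  # begins a chunk: previous token must be outside
        return 1.0 if in_chunk and prev in (None, 0) else -1.0
    # label == 2: continues a chunk: previous token must be inside one
    return 1.0 if in_chunk and prev in (1, 2) else -1.0


if __name__ == "__main__":
    print(greedy_decode(["the", "New", "York", "office"], 3, toy_score))
```

A greedy decoder commits to each label as it goes, so it trades the global optimality guarantee of dynamic programming for speed and memory; whether the paper's model makes the same trade is not stated in the abstract.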
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Huang, M., Haralick, R.M. (2012). Developing an Algorithm for Mining Semantics in Texts. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28601-8_21
DOI: https://doi.org/10.1007/978-3-642-28601-8_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28600-1
Online ISBN: 978-3-642-28601-8
eBook Packages: Computer Science (R0)