Abstract
This paper introduces a Mutual Information Independence Model (MIIM) and proposes a feature relaxation principle that uses hierarchical features to resolve the data sparseness problem in MIIM-based named entity recognition. The result is a named entity recognition system with better performance and better portability. On the MUC-6 and MUC-7 English named entity tasks, our system achieves F-measures of 96.1% and 93.7%, respectively. The evaluation also shows that 20K words of training data are enough to reach a performance of 90 percent when the features have a hierarchical structure, compared with 30K words when they do not. This suggests that hierarchical features offer the potential for much better portability.
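The abstract does not spell out the relaxation procedure, so the following is only a minimal sketch of the general idea of backing off from a sparse, specific feature to a more general one in a hierarchy. The three-level hierarchy (word, capitalization shape, wildcard), the `min_count` threshold, and all function names are assumptions made for illustration; they are not the MIIM formulation or the paper's actual feature set.

```python
from collections import Counter

def levels(word):
    """Hypothetical feature hierarchy for one token, most specific first."""
    shape = "InitCap" if word[:1].isupper() else "Lower"
    return [("word", word.lower()), ("shape", shape), ("any", "*")]

def train(samples):
    """Count every hierarchy level of every (word, label) training pair."""
    feat_counts, joint_counts = Counter(), Counter()
    for word, label in samples:
        for feat in levels(word):
            feat_counts[feat] += 1
            joint_counts[feat, label] += 1
    return feat_counts, joint_counts

def estimate(word, label, feat_counts, joint_counts, min_count=2):
    """Relax the feature level by level until one was seen often enough.

    Returns the relative-frequency estimate of P(label | feature) and the
    hierarchy level that was actually used.
    """
    for feat in levels(word):
        if feat_counts[feat] >= min_count:
            return joint_counts[feat, label] / feat_counts[feat], feat
    return 0.0, None

# Toy training data (invented for this sketch, not from MUC-6 or MUC-7).
samples = [("Smith", "PER"), ("Smith", "PER"), ("Jones", "PER"),
           ("the", "O"), ("the", "O"), ("Paris", "LOC")]
feat_counts, joint_counts = train(samples)

seen = estimate("Smith", "PER", feat_counts, joint_counts)
unseen = estimate("Brown", "PER", feat_counts, joint_counts)
```

In this toy setting, a word seen often enough is scored at the word level, while an unseen capitalized word falls back to the shape level, so less training data is needed before a usable estimate exists. This is one plausible reading of why a hierarchical structure in the features could help portability.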
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhou, G., Su, J., Yang, L. (2005). Resolution of Data Sparseness in Named Entity Recognition Using Hierarchical Features and Feature Relaxation Principle. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2005. Lecture Notes in Computer Science, vol 3406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30586-6_84
DOI: https://doi.org/10.1007/978-3-540-30586-6_84
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-24523-0
Online ISBN: 978-3-540-30586-6
eBook Packages: Computer Science, Computer Science (R0)