Abstract
Free-form writing has become increasingly popular, and people often use nicknames in place of original names. Traditional named entity recognition, however, does not perform well on the nickname recognition problem. We therefore chose the automobile domain and built a complete process for recognizing the nicknames of automobiles in Chinese text. This paper presents a new method for that task. First, we give a typical definition of nicknames. Then we use machine learning methods to estimate transition and emission probabilities from our training set. Finally, the nicknames are identified through maximum matching on the optimal state sequence. The results show that our method achieves competitive performance in nickname recognition: precision of 95.2%, recall of 91.5%, and an F-measure of 0.9331 on our passage test set. The method will help build a database of nicknames and could be used in data mining and search engines for the automobile domain, among other applications.
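The abstract's three steps (estimate transition and emission probabilities, find the optimal state sequence, read nicknames off that sequence) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's actual design: the B/I/O tag set, the toy training sentences, the add-alpha smoothing, the use of Viterbi decoding for the optimal state sequence, and the simple run extraction that stands in for the maximum matching step, whose details the abstract does not give.

```python
import math

# Hypothetical toy training set: (character, state) pairs.
# "B"/"I" = beginning/inside of a nickname, "O" = other (assumed tag set).
TRAIN = [
    [("我", "O"), ("的", "O"), ("小", "B"), ("宝", "I"), ("很", "O"), ("好", "O")],
    [("小", "B"), ("宝", "I"), ("省", "O"), ("油", "O")],
    [("他", "O"), ("很", "O"), ("小", "O")],
]
STATES = ["B", "I", "O"]
UNK = "<unk>"  # placeholder symbol for characters unseen in training


def log_norm(counts, outcomes, alpha=0.1):
    """Turn raw counts into add-alpha smoothed log probabilities."""
    total = sum(counts.get(o, 0) for o in outcomes) + alpha * len(outcomes)
    return {o: math.log((counts.get(o, 0) + alpha) / total) for o in outcomes}


def train_hmm(data, states):
    """Estimate start, transition, and emission log probabilities by counting."""
    vocab = sorted({ch for sent in data for ch, _ in sent}) + [UNK]
    start = {}
    trans = {s: {} for s in states}
    emit = {s: {} for s in states}
    for sent in data:
        first_state = sent[0][1]
        start[first_state] = start.get(first_state, 0) + 1
        for i, (ch, st) in enumerate(sent):
            emit[st][ch] = emit[st].get(ch, 0) + 1
            if i > 0:
                prev = sent[i - 1][1]
                trans[prev][st] = trans[prev].get(st, 0) + 1
    return (
        log_norm(start, states),
        {s: log_norm(trans[s], states) for s in states},
        {s: log_norm(emit[s], vocab) for s in states},
    )


def viterbi(chars, states, model):
    """Return the most probable state sequence for a character sequence."""
    if not chars:
        return []
    start_lp, trans_lp, emit_lp = model

    def e(state, ch):
        return emit_lp[state].get(ch, emit_lp[state][UNK])

    score = [{s: start_lp[s] + e(s, chars[0]) for s in states}]
    back = []
    for ch in chars[1:]:
        prev = score[-1]
        row, ptr = {}, {}
        for s in states:
            best = max(states, key=lambda p: prev[p] + trans_lp[p][s])
            row[s] = prev[best] + trans_lp[best][s] + e(s, ch)
            ptr[s] = best
        score.append(row)
        back.append(ptr)
    path = [max(states, key=lambda s: score[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def extract_nicknames(chars, path):
    """Collect each contiguous B I* run as one candidate nickname."""
    spans, current = [], ""
    for ch, st in zip(chars, path):
        if st == "B":
            if current:
                spans.append(current)
            current = ch
        elif st == "I" and current:
            current += ch
        else:
            if current:
                spans.append(current)
            current = ""
    if current:
        spans.append(current)
    return spans


def f_measure(precision, recall):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    return 2 * precision * recall / (precision + recall)


model = train_hmm(TRAIN, STATES)
sentence = list("小宝很省油")
states_found = viterbi(sentence, STATES, model)
nicknames = extract_nicknames(sentence, states_found)
```

With three training sentences the model is only illustrative; a real system would need a much larger annotated corpus. The `f_measure` helper shows how the reported precision and recall combine into the reported F-measure.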
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, C., Yu, W., Li, W., Xu, Z. (2009). A Novel Method of Automobiles’ Chinese Nickname Recognition. In: Li, W., Mollá-Aliod, D. (eds) Computer Processing of Oriental Languages. Language Technology for the Knowledge-based Economy. ICCPOL 2009. Lecture Notes in Computer Science, vol 5459. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00831-3_7
DOI: https://doi.org/10.1007/978-3-642-00831-3_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00830-6
Online ISBN: 978-3-642-00831-3
eBook Packages: Computer Science, Computer Science (R0)