Abstract
With the rapid growth of user-generated media, Twitter has become an important information resource where users share fresh information on any subject. To address the problem of finding tweets related to a given organization, we propose a two-stage approach to organization name disambiguation. Insufficient information and the diversity of organizations are the two key problems in this task. To address insufficient information, we introduce multiple types of features that enrich the information available about each organization. To address the diversity of organizations, we mine the relationships between tweets and the organization, and the relationships among tweets, in two stages. Furthermore, we examine the distribution of ambiguity across organization names and its influence on different classifiers. Our experimental results on WePS-3 show that the proposed methods are effective and promising for this task.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, S., Wu, J., Zheng, D., Meng, Y., Xia, Y., Yu, H. (2012). Two Stages Based Organization Name Disambiguity. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28604-9_21
DOI: https://doi.org/10.1007/978-3-642-28604-9_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28603-2
Online ISBN: 978-3-642-28604-9
eBook Packages: Computer Science, Computer Science (R0)