Mining the Personal Interests of Microbloggers via Exploiting Wikipedia Knowledge

Fan, Miao; Zhou, Qiang; Zheng, Thomas Fang

doi:10.1007/978-3-642-54903-8_16

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8404)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

This paper addresses an emerging research topic: mining microbloggers’ personalized interest tags from the microblogs they have posted. It is based on the intuition that microblogs indicate the daily interests and concerns of microbloggers. Previous studies regarded the microblogs posted by one microblogger as a single document and adopted traditional keyword extraction approaches to select highly weighted nouns, without considering the characteristics of microblogs. Given the limited textual information in microblogs and the implicit way microbloggers express their interests, we propose a new research framework for mining microbloggers’ interests by exploiting Wikipedia, a huge online encyclopedia of world knowledge, to address these challenges. Based on a semantic graph constructed from Wikipedia, the proposed semantic spreading model (SSM) can discover and leverage semantically related interest tags that do not occur in one’s microblogs. Based on SSM, an interest mining system has been implemented and deployed on Sina Weibo, the biggest microblogging platform in China. We have also specified a suite of new evaluation metrics to compensate for the lack of evaluation functions in this research topic. Experiments conducted on a real-time dataset demonstrate that our approach outperforms state-of-the-art methods in identifying microbloggers’ interests.
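
The abstract does not give the details of SSM, so the following is only an illustrative sketch of the general idea of spreading weights over a semantic graph, not the authors' model. It assumes a generic spreading-activation scheme; the function name `spread_activation`, the `decay` and `steps` parameters, and the toy graph with its edge weights are all hypothetical. The point it shows is how a concept that never appears in a user's microblogs (here "Basketball") can still receive a score through its links to terms that do appear.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, steps=2):
    """Propagate seed weights over a weighted directed graph.

    graph: dict mapping node -> {neighbor: edge_weight} (hypothetical format)
    seeds: dict mapping node -> initial weight (e.g. term weights from microblogs)
    decay: assumed damping factor applied at every hop
    steps: assumed number of propagation rounds
    """
    scores = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(steps):
        next_frontier = defaultdict(float)
        # Push each active node's weight to its neighbors, damped by decay.
        for node, activation in frontier.items():
            for neighbor, weight in graph.get(node, {}).items():
                next_frontier[neighbor] += activation * weight * decay
        # Accumulate the newly received activation into the total scores.
        for node, activation in next_frontier.items():
            scores[node] += activation
        frontier = next_frontier
    return dict(scores)


# Hypothetical toy graph; the edge weights are made up for illustration.
toy_graph = {
    "NBA": {"Basketball": 1.0},
    "Kobe Bryant": {"Basketball": 0.8, "NBA": 0.6},
    "Basketball": {"Sport": 0.5},
}

# Terms assumed to occur in a user's microblogs, with equal initial weight.
seed_weights = {"NBA": 1.0, "Kobe Bryant": 1.0}

scores = spread_activation(toy_graph, seed_weights, decay=0.5, steps=1)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy run "Basketball" is not among the seed terms, yet it ends up with a nonzero score because both seed terms link to it. A real system would need a graph derived from Wikipedia and a principled choice of weights, which the abstract leaves unspecified.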

Author information

Authors and Affiliations

CSLT, Division of Technical Innovation and Development, Tsinghua National Laboratory for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, China
Miao Fan, Qiang Zhou & Thomas Fang Zheng

Editor information

Editors and Affiliations

Center for Computing Research, National Polytechnic Institute, Av. Juan Dios Bátiz, Col. Nueva Industrial Vallejo, 07738, Mexico D.F, Mexico
Alexander Gelbukh

About this paper

Cite this paper

Fan, M., Zhou, Q., Zheng, T.F. (2014). Mining the Personal Interests of Microbloggers via Exploiting Wikipedia Knowledge. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8404. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54903-8_16

DOI: https://doi.org/10.1007/978-3-642-54903-8_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-54902-1
Online ISBN: 978-3-642-54903-8
eBook Packages: Computer Science, Computer Science (R0)