Mining the Personal Interests of Microbloggers via Exploiting Wikipedia Knowledge

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8404)

Abstract

This paper addresses an emerging research topic: mining microbloggers’ personalized interest tags from the microblogs they have posted. It is based on the intuition that microblogs reflect the daily interests and concerns of microbloggers. Previous studies treated all the microblogs posted by one microblogger as a single document and applied traditional keyword extraction approaches to select highly weighted nouns, without considering the characteristics of microblogs. Given the limited textual information in microblogs and the implicit way microbloggers express their interests, we propose a new research framework for mining microbloggers’ interests that exploits Wikipedia, a huge online encyclopedia of world knowledge, to address these challenges. Based on a semantic graph constructed from Wikipedia, the proposed semantic spreading model (SSM) can discover and leverage semantically related interest tags that do not occur in one’s microblogs. Based on SSM, an interest mining system has been implemented and deployed on Sina Weibo, the biggest microblogging platform in China. We also specify a suite of new evaluation metrics to compensate for the lack of evaluation functions in this research topic. Experiments conducted on a real-time dataset demonstrate that our approach outperforms state-of-the-art methods in identifying microbloggers’ interests.
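
The abstract does not give the formulas behind SSM, so the following is only a generic spreading-activation sketch of the idea that tags absent from a user's microblogs can still receive weight through a semantic graph. It is not the authors' model. The function names `spread_interest` and `top_tags`, the `decay` and `steps` parameters, the relatedness values, and the toy graph are all assumptions made for illustration.

```python
from collections import defaultdict


def spread_interest(seed_weights, graph, decay=0.5, steps=2):
    """Propagate seed tag weights over a weighted concept graph.

    seed_weights: {tag: weight} for tags observed in a user's microblogs.
    graph: {tag: {neighbor: relatedness}} (hypothetical, e.g. built from
           Wikipedia links or categories).
    decay, steps: illustrative parameters, not taken from the paper.
    """
    scores = defaultdict(float)
    for tag, weight in seed_weights.items():
        scores[tag] += weight

    frontier = dict(seed_weights)
    for _ in range(steps):
        next_frontier = defaultdict(float)
        for node, activation in frontier.items():
            for neighbor, relatedness in graph.get(node, {}).items():
                # Each hop passes on a decayed share of the activation.
                next_frontier[neighbor] += decay * relatedness * activation
        for node, activation in next_frontier.items():
            scores[node] += activation
        frontier = next_frontier
    return dict(scores)


def top_tags(scores, k=3):
    """Return the k highest-scoring tags, ties broken alphabetically."""
    return sorted(scores, key=lambda tag: (-scores[tag], tag))[:k]


# Toy graph with made-up relatedness values.
graph = {
    "NBA": {"Basketball": 0.9},
    "Kobe Bryant": {"Basketball": 0.8, "NBA": 0.7},
    "Basketball": {"Sport": 0.6},
}
# Tags assumed to appear in the user's microblogs.
seeds = {"NBA": 1.0, "Kobe Bryant": 1.0}

scores = spread_interest(seeds, graph)
print(top_tags(scores, k=3))
```

In this toy run, "Basketball" never appears among the seed tags yet ends up ranked among the top tags, which is the kind of implicit interest the abstract says keyword extraction alone would miss.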

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fan, M., Zhou, Q., Zheng, T.F. (2014). Mining the Personal Interests of Microbloggers via Exploiting Wikipedia Knowledge. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2014. Lecture Notes in Computer Science, vol 8404. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54903-8_16

  • DOI: https://doi.org/10.1007/978-3-642-54903-8_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-54902-1

  • Online ISBN: 978-3-642-54903-8

  • eBook Packages: Computer Science, Computer Science (R0)
