Social-Correlation Based Mutual Reinforcement for Short Text Classification and User Interest Tagging

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 8346))

Abstract

Short texts such as micro-blog messages are becoming increasingly prevalent in China. Because the features associated with short text are sparse, accurately classifying short text and tagging user interest have become important and challenging tasks. Many recent studies use external data to address the data sparsity issue but fail to leverage social correlation, which is expected to improve the accuracy of short text classification. In this paper, we present a new method that uses a semi-supervised coupled mutual reinforcement framework based on social correlation to classify short text and tag user interest simultaneously. Our method requires relatively few labeled examples to initialize the training process. Experimental results demonstrate that it achieves 100% accuracy on certain categories and significantly improves classification accuracy on the others. The experiments also show that our model is effective in user interest tagging.
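The abstract does not give the update rules, so the following is only a toy sketch of the general idea of coupled mutual reinforcement, not the paper's algorithm. It assumes each text has one author, that a content-only classifier has already produced per-label scores (hard-coded here), and that a few texts carry seed labels. The function name `mutual_reinforce`, the mixing weight `alpha`, and the example data are all hypothetical.

```python
from collections import defaultdict


def mutual_reinforce(content_scores, author_of, seeds, alpha=0.5, iters=10):
    """Toy coupled reinforcement between text labels and user interests.

    content_scores: {text_id: {label: score}} from an assumed content-only model
    author_of:      {text_id: user_id}
    seeds:          {text_id: label} for the few labeled texts
    alpha:          assumed weight given to the author's interest profile
    """
    labels = sorted({l for scores in content_scores.values() for l in scores})

    def normalize(dist):
        total = sum(dist.values())
        if total == 0:
            return {l: 1.0 / len(labels) for l in labels}
        return {l: dist[l] / total for l in labels}

    def one_hot(label):
        return {l: 1.0 if l == label else 0.0 for l in labels}

    # Content-only distribution for every text; seeds are clamped to their label.
    base = {t: normalize({l: s.get(l, 0.0) for l in labels})
            for t, s in content_scores.items()}
    text_dist = {t: one_hot(seeds[t]) if t in seeds else base[t] for t in base}

    user_dist = {}
    for _ in range(iters):
        # User step: a user's interest profile is the average of their texts.
        sums = defaultdict(lambda: dict.fromkeys(labels, 0.0))
        for t, dist in text_dist.items():
            for l in labels:
                sums[author_of[t]][l] += dist[l]
        user_dist = {u: normalize(s) for u, s in sums.items()}

        # Text step: mix content evidence with the author's interest profile.
        for t in base:
            if t in seeds:
                continue  # labeled examples stay fixed
            profile = user_dist[author_of[t]]
            mixed = {l: (1 - alpha) * base[t][l] + alpha * profile[l]
                     for l in labels}
            text_dist[t] = normalize(mixed)

    text_labels = {t: max(labels, key=lambda l: d[l]) for t, d in text_dist.items()}
    user_tags = {u: max(labels, key=lambda l: d[l]) for u, d in user_dist.items()}
    return text_labels, user_tags


# Hypothetical example: two users, four texts, two seed labels.
content_scores = {
    "t1": {"sports": 0.9, "tech": 0.1},
    "t2": {"sports": 0.45, "tech": 0.55},  # ambiguous on content alone
    "t3": {"sports": 0.1, "tech": 0.9},
    "t4": {"sports": 0.5, "tech": 0.5},
}
author_of = {"t1": "alice", "t2": "alice", "t3": "bob", "t4": "bob"}
seeds = {"t1": "sports", "t3": "tech"}

text_labels, user_tags = mutual_reinforce(content_scores, author_of, seeds)
```

In this sketch the ambiguous text `t2` is pulled toward its author's dominant interest, which illustrates how social information might compensate for sparse text features. The actual framework in the paper may use different signals and update rules.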



Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Li, R., Zhang, Y. (2013). Social-Correlation Based Mutual Reinforcement for Short Text Classification and User Interest Tagging. In: Motoda, H., Wu, Z., Cao, L., Zaiane, O., Yao, M., Wang, W. (eds) Advanced Data Mining and Applications. ADMA 2013. Lecture Notes in Computer Science, vol 8346. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53914-5_38

  • DOI: https://doi.org/10.1007/978-3-642-53914-5_38

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-53913-8

  • Online ISBN: 978-3-642-53914-5

  • eBook Packages: Computer Science, Computer Science (R0)
