Social-Correlation Based Mutual Reinforcement for Short Text Classification and User Interest Tagging

Li, Rong; Zhang, Ya

doi:10.1007/978-3-642-53914-5_38

Rong Li & Ya Zhang

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8346)

Abstract

Short text, such as micro-blog messages, is becoming increasingly prevalent in China. Because the features associated with short text are sparse, accurately classifying short text and tagging user interests are important but challenging tasks. Many recent studies use external data to address the sparsity problem, but they fail to leverage social correlation, which is expected to improve the accuracy of short text classification. In this paper, we present a semi-supervised coupled mutual reinforcement framework, based on social correlation, that simultaneously classifies short text and tags user interests. The method requires relatively few labeled examples to initialize training. Experimental results show that it achieves 100% accuracy on certain categories and significantly improves classification accuracy on the others. The experiments also show that the model is effective for user interest tagging.
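
The abstract does not give the update rules, so the following is only a generic sketch of what a coupled mutual reinforcement loop between message labels and user interests could look like; it is not the authors' algorithm. Everything here is an assumption made for illustration: the function name coupled_reinforcement, the mixing weight alpha, the use of per-message content scores from some base classifier, and the simplification of social correlation to authorship alone (a message is linked only to the user who posted it).

```python
def normalize(scores):
    """Scale non-negative scores to sum to 1; fall back to uniform if all are zero."""
    total = sum(scores.values())
    if total == 0:
        return {c: 1.0 / len(scores) for c in scores}
    return {c: v / total for c, v in scores.items()}


def coupled_reinforcement(posts, priors, seeds, categories, alpha=0.5, n_iter=20):
    """Illustrative mutual reinforcement between message labels and user interests.

    posts:  dict message_id -> user_id (authorship, the only social link assumed here)
    priors: dict message_id -> dict category -> content score (hypothetical base classifier)
    seeds:  dict message_id -> category (the few labeled examples)
    alpha:  assumed weight of content evidence versus the author's interest profile
    """
    def one_hot(cat):
        return {c: 1.0 if c == cat else 0.0 for c in categories}

    def prior_of(p):
        return normalize({c: priors.get(p, {}).get(c, 0.0) for c in categories})

    # Seeds start (and stay) at their known label; other messages start at their prior.
    post_dist = {p: one_hot(seeds[p]) if p in seeds else prior_of(p) for p in posts}
    user_dist = {}

    for _ in range(n_iter):
        # Step 1: a user's interest profile is the average label distribution of their messages.
        sums = {}
        for p, u in posts.items():
            acc = sums.setdefault(u, {c: 0.0 for c in categories})
            for c in categories:
                acc[c] += post_dist[p][c]
        user_dist = {u: normalize(acc) for u, acc in sums.items()}

        # Step 2: each unlabeled message mixes its content prior with its author's profile.
        for p, u in posts.items():
            if p in seeds:
                continue
            prior = prior_of(p)
            mixed = {c: alpha * prior[c] + (1 - alpha) * user_dist[u][c] for c in categories}
            post_dist[p] = normalize(mixed)

    return post_dist, user_dist


# Toy data: two users, one labeled message each, one ambiguous message each.
categories = ["sports", "tech"]
posts = {"p1": "alice", "p2": "alice", "p3": "bob", "p4": "bob"}
priors = {"p2": {"sports": 0.5, "tech": 0.5}, "p4": {"sports": 0.5, "tech": 0.5}}
seeds = {"p1": "sports", "p3": "tech"}

post_dist, user_dist = coupled_reinforcement(posts, priors, seeds, categories)
predicted = {p: max(d, key=d.get) for p, d in post_dist.items()}
user_tags = {u: max(d, key=d.get) for u, d in user_dist.items()}
```

In this toy run the two ambiguous messages have uninformative content scores, so their predicted labels are decided by their authors' profiles, which in turn are anchored by the single seed label per user. That is the intuition behind starting from few labeled examples; the paper's actual formulation may differ substantially.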

Author information

Authors and Affiliations

Shanghai Key Laboratory of Multimedia Processing and Transmissions, Shanghai Jiao Tong University, Shanghai, China
Rong Li & Ya Zhang

Editor information

Editors and Affiliations

US Air Force Office of Scientific Research, 106-0032, Tokyo, Japan
Hiroshi Motoda
School of Computer Science and Technology, Zhejiang University, 310027, Hangzhou, China
Zhaohui Wu
Faculty of Engineering and Information Technology, University of Technology, Chippendale, 2008, Sydney, NSW, Australia
Longbing Cao
Department of Computing Science, University of Alberta, T6G 2E8, Edmonton, Canada
Osmar Zaiane
College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Min Yao
School of Computer Science, Fudan University, 200433, Shanghai, China
Wei Wang

About this paper

Cite this paper

Li, R., Zhang, Y. (2013). Social-Correlation Based Mutual Reinforcement for Short Text Classification and User Interest Tagging. In: Motoda, H., Wu, Z., Cao, L., Zaiane, O., Yao, M., Wang, W. (eds) Advanced Data Mining and Applications. ADMA 2013. Lecture Notes in Computer Science, vol 8346. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53914-5_38

DOI: https://doi.org/10.1007/978-3-642-53914-5_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-53913-8
Online ISBN: 978-3-642-53914-5
eBook Packages: Computer Science, Computer Science (R0)