Inferring User Profile Using Microblog Content and Friendship Network

Zhao, Zhishan; Du, Jiachen; Gao, Qinghong; Gui, Lin; Xu, Ruifeng

doi:10.1007/978-981-10-6805-8_3


Part of the book series: Communications in Computer and Information Science (CCIS, volume 774)

Included in the following conference series:

Chinese National Conference on Social Media Processing


Abstract

With the rapid development of microblogs in recent years, accurate prediction of microblog user profiles has become valuable for marketing, personalized recommendation, and legal investigation. Microblog users post rich content every day and build a complex friendship network through “following” behaviors. Both user-generated content and the friendship network are crucial for user profiling. In this work, we propose a neural-network-based model for user profiling. It takes advantage of both user-generated content and the friendship network by combining attentional multi-scale convolutional neural networks with graph embeddings. We evaluate our model on the SMP CUP 2016 dataset, whose task is to infer the age, gender, and region of microblog users. The experimental results show that, by utilizing information from both user-generated content and the friendship network, our method achieves state-of-the-art performance on all three subtasks.
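The abstract names the components of the model but not their details. As a rough illustration only, the NumPy sketch below shows one way such pieces could fit together: convolutions of several widths over a user's token embeddings, attention pooling over positions, and concatenation with a precomputed graph embedding of the user. Every dimension, weight, and function name here is a hypothetical choice made for the example, not the authors' configuration.

```python
import numpy as np

def conv1d_relu(x, w, b):
    """Valid 1-D convolution over the token axis followed by ReLU.
    x: (seq_len, emb_dim); w: (width, emb_dim, n_filters); b: (n_filters,)"""
    width = w.shape[0]
    # Stack all sliding windows: (n_positions, width, emb_dim)
    windows = np.stack([x[i:i + width] for i in range(x.shape[0] - width + 1)])
    out = np.einsum("pwe,wef->pf", windows, w) + b
    return np.maximum(out, 0.0)

def attention_pool(h, v):
    """Softmax-weighted average of feature rows.
    h: (n_positions, n_filters); v: (n_filters,) scoring vector."""
    scores = h @ v
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ h

def user_representation(tokens, graph_vec, params):
    """Concatenate one attention-pooled vector per filter width with the
    user's graph embedding (assumed to be computed elsewhere)."""
    pooled = [attention_pool(conv1d_relu(tokens, w, b), v) for w, b, v in params]
    return np.concatenate(pooled + [graph_vec])

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
emb_dim, n_filters, graph_dim = 8, 4, 6
widths = (2, 3, 4)  # the "multi-scale" part: several filter widths
params = [
    (rng.normal(size=(k, emb_dim, n_filters)) * 0.1,  # filter weights
     np.zeros(n_filters),                              # bias
     rng.normal(size=n_filters))                       # attention vector
    for k in widths
]
tokens = rng.normal(size=(10, emb_dim))   # stand-in for word embeddings
graph_vec = rng.normal(size=graph_dim)    # stand-in for a graph embedding
rep = user_representation(tokens, graph_vec, params)
print(rep.shape)  # (18,) = 3 widths * 4 filters + 6 graph dimensions
```

In a full model, a vector like `rep` would feed a classifier for each attribute (age, gender, region), and all weights would be learned rather than sampled at random.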

Notes

1. https://biendata.com/competition/smpcup2016/.
2. http://pyltp.readthedocs.io/zh_CN/latest/.
3. Because training data is insufficient, it is difficult for the model to learn how to map a specific location that appears in user-generated content to the region it belongs to. Hence, we construct a region dictionary from geographic knowledge and Sina Weibo location information to help our model relate locations to regions.
4. https://nlp.stanford.edu/projects/glove/.
5. https://github.com/tangjianpku/LINE.
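
The region dictionary mentioned in note 3 is not specified further here. A minimal sketch, assuming it is a plain mapping from location names to region labels that is matched against post text, might look like the following. The entries, the region labels, and the `region_counts` helper are all hypothetical, and how the resulting counts enter the model is likewise an assumption.

```python
from collections import Counter

# Hypothetical entries; a real dictionary would be built from geographic
# knowledge and Sina Weibo location information, as the note describes.
REGION_DICT = {
    "Shenzhen": "South China",
    "Guangzhou": "South China",
    "Harbin": "Northeast China",
    "Beijing": "North China",
    "Shanghai": "East China",
}

def region_counts(posts):
    """Count, per region, how often its known locations occur in the posts."""
    counts = Counter()
    for post in posts:
        for location, region in REGION_DICT.items():
            n = post.count(location)
            if n > 0:  # skip locations that do not appear in this post
                counts[region] += n
    return counts

posts = [
    "Back in Shenzhen after a week in Guangzhou",
    "Shenzhen traffic again",
]
print(region_counts(posts))  # Counter({'South China': 3})
```

Counts like these could serve as extra input features, giving the model a direct location-to-region signal that it would otherwise have to learn from limited training data.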


Acknowledgments

This work was supported by the National Natural Science Foundation of China (61370165, U1636103, 61632011), Shenzhen Foundational Research Funding (JCYJ20150625142543470, JCYJ20170307150024907), and the Guangdong Provincial Engineering Technology Research Center for Data Science (2016KF09).

Author information

Authors and Affiliations

School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, China
Zhishan Zhao, Jiachen Du, Qinghong Gao & Ruifeng Xu
School of Mathematics and Computer Science, Fuzhou University, Fuzhou, China
Lin Gui
Guangdong Provincial Engineering Technology Research Center for Data Science, Guangzhou, China
Ruifeng Xu

Corresponding author

Correspondence to Ruifeng Xu.

Editor information

Editors and Affiliations

Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Xueqi Cheng
Beijing Jinri Toutiao Technology Co. Ltd, Beijing, China
Weiying Ma
Arizona State University, Tempe, Arizona, USA
Huan Liu
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Huawei Shen
Renmin University of China, Beijing, China
Shizheng Feng
Microsoft Asia Research, Beijing, China
Xing Xie

About this paper

Cite this paper

Zhao, Z., Du, J., Gao, Q., Gui, L., Xu, R. (2017). Inferring User Profile Using Microblog Content and Friendship Network. In: Cheng, X., Ma, W., Liu, H., Shen, H., Feng, S., Xie, X. (eds) Social Media Processing. SMP 2017. Communications in Computer and Information Science, vol 774. Springer, Singapore. https://doi.org/10.1007/978-981-10-6805-8_3

DOI: https://doi.org/10.1007/978-981-10-6805-8_3
Published: 26 October 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6804-1
Online ISBN: 978-981-10-6805-8
eBook Packages: Computer Science, Computer Science (R0)
