Abstract
Deeply understanding online users from their behavior data is critical to providing them with personalized services. However, existing methods for learning user representations are usually built on supervised frameworks such as demographic prediction and product recommendation. These methods rely heavily on labeled data to train user-representation models, and the representations they learn can be used only in specific tasks. Motivated by the success of pretrained word embeddings in many natural language processing (NLP) tasks, we propose a simple but effective neural user-embedding approach that learns deep representations of online users from their unlabeled behavior data. Once users are encoded as low-dimensional dense embedding vectors, these hidden user vectors can serve as additional user features in various user-involved tasks, such as demographic prediction, to enrich the user representation. In our neural user embedding (NEU) approach, behavior events are represented in two ways: an ID-based event embedding, which is derived from the IDs of the events, and a text-based event embedding, which is derived from their textual content. We conduct experiments on a real-world web-browsing dataset. The results show that our approach can learn informative user embeddings from unlabeled browsing-behavior data and that these embeddings can facilitate many tasks that involve user modeling, such as user-age and user-gender prediction.
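The two event views named in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the embedding tables are randomly initialised rather than learned, mean pooling over events is an assumption, and every name here (`make_table`, `mean_vectors`, `id_based_event_embedding`, `text_based_event_embedding`, `user_embedding`, the sample events) is hypothetical.

```python
import random

DIM = 4  # embedding size, chosen arbitrarily for illustration


def make_table(keys, dim=DIM, seed=0):
    """Randomly initialised embedding table; a real model would learn these."""
    rng = random.Random(seed)
    return {k: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for k in sorted(keys)}


def mean_vectors(vectors, dim=DIM):
    """Element-wise mean of a list of vectors; zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def id_based_event_embedding(event_id, id_table):
    """ID-based view: look the event up by its ID."""
    return id_table[event_id]


def text_based_event_embedding(text, word_table):
    """Text-based view: average the vectors of the known words in the event text."""
    words = [w for w in text.lower().split() if w in word_table]
    return mean_vectors([word_table[w] for w in words])


def user_embedding(event_vectors):
    """Pool event vectors into one user vector (mean pooling is an assumption)."""
    return mean_vectors(event_vectors)


# Hypothetical browsing events: (event ID, page title).
events = [("url_1", "cheap flights to paris"), ("url_2", "paris hotel deals")]
id_table = make_table([e for e, _ in events], seed=1)
word_table = make_table({w for _, t in events for w in t.split()}, seed=2)

user_vec_id = user_embedding(
    [id_based_event_embedding(e, id_table) for e, _ in events]
)
user_vec_text = user_embedding(
    [text_based_event_embedding(t, word_table) for _, t in events]
)

# Downstream use: append the user vector to existing (made-up) features.
features = [0.0, 1.0] + user_vec_id
```

In this sketch the ID-based view can only represent events seen when the table was built, whereas the text-based view can embed an unseen event as long as some of its words are known.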
Acknowledgments
The main part of this work was done when the two authors were interns at Microsoft Research Asia. The authors would like to thank Fangzhao Wu, Zheng Liu, and Xing Xie (Microsoft Research Asia) for their support and discussions. This work was partially supported by the Institute for Basic Science (IBS-R029-C2).
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
An, M., Kim, S. (2021). Neural User Embedding from Browsing Events. In: Dong, Y., Mladenić, D., Saunders, C. (eds) Machine Learning and Knowledge Discovery in Databases: Applied Data Science Track. ECML PKDD 2020. Lecture Notes in Computer Science, vol 12460. Springer, Cham. https://doi.org/10.1007/978-3-030-67667-4_11
DOI: https://doi.org/10.1007/978-3-030-67667-4_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67666-7
Online ISBN: 978-3-030-67667-4
eBook Packages: Computer Science (R0)