ABSTRACT
In this paper we describe our solution for the RecSys Challenge 2021, focused on tweet recommendation to increase the number of user interactions. A large dataset of 800 million tweets produced over four weeks is used as training data to predict four types of interaction: Like, Reply, Retweet and Quote. The proposed challenge is very similar to last year's but incorporates a new fairness metric: authors are split into groups based on their number of followers, and the final score is computed by averaging the PRAUC and RCE of these groups. To satisfy this new constraint, we present a two-branch architecture that separates authors according to their total number of interactions in the dataset. In this way, authors who appear only a few times (cold-start users) are predicted using similar users, and the same holds for users with many interactions (active users). Each branch consists of a chain of four LightGBM models, one per target. All of them use features we extracted from the interaction, but each also uses the output of the previous model: we first predict Like and use that output to predict Retweet, then predict Reply using Like and Retweet, and so on. The users' popularity, as well as the first and last words of the tweet text, turned out to be the best features for our method. Our solution obtained 5th place in the final ranking and won the 2nd prize in the academic category. All the source code is available online.
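The chained-target idea described above — each target's model consumes the predictions of the targets before it — can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses scikit-learn's `GradientBoostingClassifier` as a stand-in for LightGBM, and the feature matrix, target definitions, and cascade order (Like → Retweet → Reply → Quote, as in the abstract) are on synthetic data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n, d = 500, 8
X = rng.normal(size=(n, d))  # hypothetical per-interaction features

# Synthetic binary engagement labels, listed in cascade order.
targets = {
    "like":    (X[:, 0] > 0).astype(int),
    "retweet": (X[:, 1] > 0).astype(int),
    "reply":   (X[:, 2] > 0).astype(int),
    "quote":   (X[:, 3] > 0).astype(int),
}

preds = {}
features = X
for name, y in targets.items():
    model = GradientBoostingClassifier(n_estimators=20, random_state=0)
    model.fit(features, y)
    p = model.predict_proba(features)[:, 1]
    preds[name] = p
    # Append this target's predicted probability as an extra feature
    # for the next model in the chain.
    features = np.hstack([features, p[:, None]])
```

After the loop, the feature matrix has grown by one column per target, so the Quote model sees the original features plus the Like, Retweet and Reply predictions.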