research-article

Transparent user models for personalization

Authors:

Khalid El-Arini,

Jurgen Van Gael,

Blaise Agüera y Arcas

KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining

Pages 678 - 686

https://doi.org/10.1145/2339530.2339639

Published: 12 August 2012

Abstract

Personalization is a ubiquitous phenomenon in our daily online experience. While such technology is critical for helping us combat the overload of information we face, in many cases, we may not even realize that our results are being tailored to our personal tastes and preferences. Worse yet, when such a system makes a mistake, we have little recourse to correct it.

In this work, we propose a framework for addressing this problem by developing a new user-interpretable feature set upon which to base personalized recommendations. These features, which we call badges, represent fundamental traits of users (e.g., "vegetarian" or "Apple fanboy") inferred by modeling the interplay between a user's behavior and self-reported identity. Specifically, we consider the microblogging site Twitter, where users provide short descriptions of themselves in their profiles and perform actions such as tweeting and retweeting. Our approach is based on the insight that we can define badges using high-precision, low-recall rules (e.g., "Twitter profile contains the phrase 'Apple fanboy'") and, with enough data, generalize to other users by observing shared behavior. We develop a fully Bayesian, generative model that describes this interaction while allowing us to avoid the pitfalls associated with having positive-only data.

Experiments on real Twitter data demonstrate the effectiveness of our model at capturing rich and interpretable user traits that can be used to provide transparency for personalization.
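
As a rough illustration of the high-precision, low-recall rules mentioned in the abstract, the following Python sketch assigns seed badges by matching phrases in profile text. It is not the paper's Bayesian model and covers only the rule-based labeling step; the names `BADGE_RULES` and `seed_badges`, the whole-word matching, and the sample profiles are all assumptions made for illustration.

```python
import re

# Hypothetical badge rules: badge name -> phrase that must appear in the profile.
# Only the "Apple fanboy" phrase rule is taken from the abstract; the rest is assumed.
BADGE_RULES = {
    "vegetarian": "vegetarian",
    "Apple fanboy": "apple fanboy",
}


def seed_badges(profile_text):
    """Return the set of badges whose phrase occurs in the profile as whole words."""
    text = profile_text.lower()
    matched = set()
    for badge, phrase in BADGE_RULES.items():
        # Word boundaries keep the rule strict (high precision).
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            matched.add(badge)
    return matched


# Made-up profiles, for illustration only.
profiles = {
    "alice": "Designer, runner, and proud Apple fanboy.",
    "bob": "I tweet about iPhones all day.",  # plausibly a fan, but no rule fires
    "carol": "Vegetarian cook and cyclist.",
}

seeds = {user: seed_badges(text) for user, text in profiles.items()}
```

In this sketch `bob` receives no badge even though his profile hints at the trait, which is the low-recall side of such rules. A user with no matched badge is therefore unlabeled rather than a confirmed negative, which is the positive-only data setting the abstract refers to.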

Index Terms

Transparent user models for personalization
1. Mathematics of computing
  1. Probability and statistics

Published In

KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining

August 2012

1616 pages

ISBN:9781450314626

DOI:10.1145/2339530

General Chair:
Qiang Yang
Hong Kong University of Science and Technology
,
Program Chairs:
Deepak Agarwal
LinkedIn
,
Jian Pei
Simon Fraser University

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

KDD '12

KDD '12: The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 12 - 16, 2012

Beijing, China

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
