DOI: 10.1145/3386392.3399297 (ACM Conferences, UMAP Conference Proceedings)
research-article

Persona Prototypes for Improving the Qualitative Evaluation of Recommendation Systems

Published: 13 July 2020

ABSTRACT

Most existing research on recommendation systems aims to optimize accuracy metrics on given datasets, which leads to an algorithm-driven design of the resulting solutions. Given a limited understanding of dataset characteristics and insufficient diversity among the represented individuals, such approaches amplify hidden data biases and existing disparities. In this research, we address this problem by proposing a Persona Prototyping approach that selects a set of the most representative users to help in understanding the complex distribution of user interests and in performing a proper qualitative evaluation of recommendation algorithms. A hierarchical density-based clustering technique is applied to distinguish diverse user groups and select their prototypes. Each selected representative is presented in the easily understandable form of a textual user story describing the prototype's behaviors, inspired by the concept of a persona from interaction design. We evaluated the diversity and representativeness of the selected individuals; the results show that the proposed method can identify diverse interest archetypes and can be used to improve the qualitative analysis of recommendations and to test how well they respond to the diversity of user needs.
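The pipeline described in the abstract (cluster users by density, pick one prototype per group, describe it as a short story) can be illustrated with a minimal sketch. This is not the authors' implementation: plain DBSCAN stands in for the hierarchical density-based method, the prototype is assumed to be the cluster medoid, and the topic names, interest vectors, `eps`, and `min_pts` values are all hypothetical toy choices made for illustration.

```python
import math

def dbscan(points, eps, min_pts):
    """Plain DBSCAN (a stand-in for the hierarchical variant); -1 marks noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # Indices within eps of point i (the point itself is included).
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reach = neighbors(j)
            if len(reach) >= min_pts:
                queue.extend(reach)  # j is a core point, keep expanding
    return labels

def select_prototypes(points, labels):
    """Assumed rule: the prototype of a cluster is its medoid, i.e. the member
    with the smallest total distance to the other members."""
    clusters = {}
    for idx, lab in enumerate(labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(idx)
    return {
        lab: min(members,
                 key=lambda i: sum(math.dist(points[i], points[j]) for j in members))
        for lab, members in clusters.items()
    }

def user_story(group, vector, topics):
    """Render a prototype as a one-sentence textual description."""
    top = max(range(len(vector)), key=lambda k: vector[k])
    return (f"Persona for group {group}: reads mostly {topics[top]} "
            f"({vector[top]:.0%} of activity).")

# Hypothetical toy data: each user is a vector of interest shares per topic.
TOPICS = ("sports", "politics")
users = [
    (0.90, 0.10), (0.85, 0.15), (0.80, 0.10),  # sports-leaning users
    (0.10, 0.90), (0.15, 0.85), (0.10, 0.80),  # politics-leaning users
    (0.50, 0.50),                              # isolated user, expected noise
]

labels = dbscan(users, eps=0.15, min_pts=3)
prototypes = select_prototypes(users, labels)
stories = {g: user_story(g, users[i], TOPICS) for g, i in prototypes.items()}

for story in stories.values():
    print(story)
```

On this toy data the two dense groups receive labels 0 and 1, the isolated user is marked as noise (-1), and the middle user of each group is chosen as its prototype. A real setup would replace the toy vectors with learned user representations and the DBSCAN stand-in with a hierarchical density-based implementation.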


Supplemental Material

3386392.3399297.mp4 (mp4, 27.5 MB)
