ABSTRACT
Attaching tags to items, such as books or movies, is common in many online systems. While most of these systems use binary tags, continuous item-tag relevance scores, such as those in tag genome, offer richer descriptions of item content. For example, tag genome for movies assigns the tag “gangster” to the movie “The Godfather (1972)” with a relevance score of 0.93 on a scale of 0 to 1. Tag genome has received considerable attention in recommender systems research and has been used in a wide variety of studies, from investigating the effects of recommender systems on users to generating ideas for movies that appeal to certain user groups.
In this paper, we present tag genome for books, a dataset of book-tag relevance scores in which a significant number of tags overlap with those in tag genome for movies. To generate the dataset, we designed a survey based on popular books and tags from the Goodreads dataset. In the survey, we asked users to rate how well tags applied to books. We then generated book-tag relevance scores from these user ratings together with features from the Goodreads dataset. In addition to supporting book recommender systems, tag genome for books can be combined with tag genome for movies to tackle cross-domain problems, such as recommending books based on movie preferences.
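To make the cross-domain idea concrete, the following is a minimal sketch of how overlapping tags could link a movie to candidate books. It is not the method used in the paper: the cosine-similarity matching, the function names, the book titles, and every score except the 0.93 "gangster" score for "The Godfather (1972)" are assumptions made for illustration only.

```python
from math import sqrt

# Toy relevance scores in [0, 1]. Only the "gangster" score for The Godfather
# (0.93) comes from the abstract; all other tags, titles, and values are
# hypothetical placeholders, not entries from either dataset.
movie_tags = {
    "The Godfather (1972)": {"gangster": 0.93, "family": 0.80, "romance": 0.20},
}
book_tags = {
    "Hypothetical Crime Novel": {"gangster": 0.90, "family": 0.70, "romance": 0.10},
    "Hypothetical Love Story": {"gangster": 0.05, "family": 0.40, "romance": 0.95},
}


def cosine_on_shared_tags(a, b):
    """Cosine similarity of two tag-score dicts, restricted to tags in both."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = sqrt(sum(a[t] ** 2 for t in shared))
    norm_b = sqrt(sum(b[t] ** 2 for t in shared))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_books_for_movie(movie, movies, books):
    """Return (book, similarity) pairs sorted from most to least similar."""
    profile = movies[movie]
    scored = [(title, cosine_on_shared_tags(profile, tags))
              for title, tags in books.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_books_for_movie("The Godfather (1972)", movie_tags, book_tags)
for title, score in ranking:
    print(f"{title}: {score:.3f}")
```

Restricting the comparison to shared tags reflects the overlap between the two tag vocabularies; a real system would likely also need to handle tags that exist in only one domain.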