DOI: 10.1145/3498366.3505833
demonstration

The Tag Genome Dataset for Books

Published: 14 March 2022

ABSTRACT

Attaching tags to items, such as books or movies, is found in many online systems. While a majority of these systems use binary tags, continuous item-tag relevance scores, such as those in tag genome, offer richer descriptions of item content. For example, tag genome for movies assigns the tag “gangster” to the movie “The Godfather (1972)” with a score of 0.93 on a scale of 0 to 1. Tag genome has received considerable attention in recommender systems research and has been used in a wide variety of studies, from investigating the effects of recommender systems on users to generating ideas for movies that appeal to certain user groups.

In this paper, we present tag genome for books, a dataset containing book-tag relevance scores, where a significant number of tags overlap with those from tag genome for movies. To generate our dataset, we designed a survey based on popular books and tags from the Goodreads dataset. In our survey, we asked users to provide ratings for how well tags applied to books. We generated book-tag relevance scores based on user ratings along with features from the Goodreads dataset. In addition to being used to create book recommender systems, tag genome for books can be combined with the tag genome for movies to tackle cross-domain problems, such as recommending books based on movie preferences.
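
The cross-domain use case mentioned above can be illustrated with a minimal sketch. It assumes each item is represented as a mapping from tag to relevance score in [0, 1], and that a user's movie preferences have already been summarized as one such tag profile. The function names, the book titles "Book A" and "Book B", and all scores except the "gangster" score of 0.93 quoted in the abstract are invented for illustration. Cosine similarity over the overlapping tags is one simple choice for comparing profiles, not necessarily the method used by the authors.

```python
import math


def cosine_on_shared_tags(profile_a, profile_b):
    """Cosine similarity restricted to tags present in both profiles."""
    shared = set(profile_a) & set(profile_b)
    if not shared:
        return 0.0
    dot = sum(profile_a[t] * profile_b[t] for t in shared)
    norm_a = math.sqrt(sum(profile_a[t] ** 2 for t in shared))
    norm_b = math.sqrt(sum(profile_b[t] ** 2 for t in shared))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_books(movie_profile, book_profiles, top_n=3):
    """Rank books by tag-profile similarity to a movie-based profile."""
    scored = [
        (title, cosine_on_shared_tags(movie_profile, tags))
        for title, tags in book_profiles.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy data. Only the "gangster" score of 0.93 comes from the abstract;
# every other tag score and both book titles are hypothetical.
movie_profile = {"gangster": 0.93, "romance": 0.10, "cinematography": 0.80}
book_profiles = {
    "Book A": {"gangster": 0.90, "romance": 0.20, "first-person narration": 0.70},
    "Book B": {"gangster": 0.05, "romance": 0.95},
}

ranked = recommend_books(movie_profile, book_profiles)
for title, score in ranked:
    print(f"{title}: {score:.3f}")
```

Tags that exist in only one domain ("cinematography", "first-person narration") are ignored by this comparison, which is why the overlap between the two tag sets matters for cross-domain tasks.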

Published in

CHIIR '22: Proceedings of the 2022 Conference on Human Information Interaction and Retrieval
March 2022
399 pages
ISBN: 9781450391863
DOI: 10.1145/3498366

Copyright © 2022 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers

• demonstration
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 55 of 163 submissions, 34%
