Driving content recommendations by building a knowledge base using weak supervision and transfer learning

Author: Sanghamitra Deb

RecSys '19: Proceedings of the 13th ACM Conference on Recommender Systems

Page 531

https://doi.org/10.1145/3298689.3346963

Published: 10 September 2019

Abstract

With 2.2 million subscribers and two hundred million content views, Chegg is a centralized hub where students come to get help with writing, science, math, and other educational needs. To have an impact on a student's learning, we present personalized content to each student. Student needs are unique, shaped by learning style, studying environment, and many other factors, and most students engage with only a subset of the products and content available at Chegg. To recommend personalized content, we have developed a generalized machine learning pipeline that handles training data generation and model building for a wide range of problems. We generate a knowledge base with a hierarchy of concepts and associate student-generated content (chat-room data, equations, chemical formulae, reviews, etc.) with concepts in the knowledge base. Collecting training data for the different parts of the knowledge base is a key bottleneck in developing NLP models, and employing subject matter experts to provide annotations is prohibitively expensive. Instead, we use weak supervision and active learning techniques, with tools such as Snorkel [2], an open source project from Stanford, to make training data generation dramatically easier. With these methods, training data is generated using broad-stroke filters and high-precision rules, and the rules are modeled probabilistically to incorporate dependencies between them. Features for the classification tasks are generated using transfer learning [1] from language models. We explored several language models, and the best performance came from sentence embeddings with skip-thought vectors, which are trained to predict the previous and the next sentence. The resulting structured information is then used to improve product features and enhance the recommendations made to students. In this presentation I will talk about efficient methods of tagging content with categories that come from a knowledge base. Using this information, we provide relevant content recommendations to students coming to Chegg for online tutoring, studying flashcards, and practicing problems.
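As a rough illustration of the weak supervision step described in the abstract, the sketch below combines a few rules into a probabilistic label for a piece of student content. It is not the author's pipeline and does not use Snorkel's API: the rule functions, keywords, concept names, and accuracy values are all hypothetical, and the accuracy-weighted vote is a simple stand-in for a learned probabilistic model of the rules (it also ignores dependencies between rules).

```python
import math
from collections import defaultdict

ABSTAIN = None  # a rule returns ABSTAIN when it has no opinion

# Hypothetical high-precision rules; keywords and concept names are illustrative.
def lf_calculus(text):
    return "calculus" if "derivative" in text or "integral" in text else ABSTAIN

def lf_chemistry(text):
    return "chemistry" if "mol" in text.split() or "h2o" in text else ABSTAIN

# Hypothetical broad-stroke filter: an equals sign hints at math, but is noisy.
def lf_equation(text):
    return "calculus" if "=" in text else ABSTAIN

# Each rule is paired with an ASSUMED accuracy; a real system would estimate these.
LABELING_FUNCTIONS = [(lf_calculus, 0.9), (lf_chemistry, 0.9), (lf_equation, 0.6)]

def weak_label(text):
    """Return (best_label, probabilities) from an accuracy-weighted vote."""
    text = text.lower()
    scores = defaultdict(float)
    for lf, accuracy in LABELING_FUNCTIONS:
        vote = lf(text)
        if vote is not ABSTAIN:
            # Log-odds weighting: more accurate rules count for more.
            scores[vote] += math.log(accuracy / (1.0 - accuracy))
    if not scores:
        return ABSTAIN, {}
    # Normalize the exponentiated scores over the labels that received votes.
    total = sum(math.exp(s) for s in scores.values())
    probs = {label: math.exp(s) / total for label, s in scores.items()}
    return max(probs, key=probs.get), probs

if __name__ == "__main__":
    for example in ["Find the derivative of x^2", "2 mol of H2O = ?", "hello there"]:
        print(example, "->", weak_label(example))
```

The probabilistic labels produced this way could then serve as noisy training targets for a classifier built on sentence-embedding features, which is the role transfer learning plays in the abstract.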

References

[1] Alexis Conneau, Douwe Kiela, Holger Schwenk, Loic Barrault, and Antoine Bordes. 2017. Supervised learning of universal sentence representations from natural language inference data. arXiv preprint arXiv:1705.02364 (2017).

[2] Alexander Ratner, Stephen H Bach, Henry Ehrenberg, Jason Fries, Sen Wu, and Christopher Ré. 2017. Snorkel: Rapid training data creation with weak supervision. Proceedings of the VLDB Endowment 11, 3 (2017), 269-282.

Index Terms

Driving content recommendations by building a knowledge base using weak supervision and transfer learning
1. Applied computing
  1. Document management and text processing
  2. Education
2. Information systems
  1. Information retrieval

Published In

RecSys '19: Proceedings of the 13th ACM Conference on Recommender Systems

September 2019

635 pages

ISBN: 9781450362436

DOI: 10.1145/3298689

General Chairs: Toine Bogers (Aalborg University Copenhagen, Denmark), Alan Said (University of Gothenburg, Sweden)

Program Chairs: Peter Brusilovsky (University of Pittsburgh), Domonkos Tikk (Gravity R&D, Hungary)

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Abstract

Conference

RecSys '19

RecSys '19: Thirteenth ACM Conference on Recommender Systems

September 16 - 20, 2019

Copenhagen, Denmark

Acceptance Rates

RecSys '19 Paper Acceptance Rate 36 of 189 submissions, 19%;

Overall Acceptance Rate 254 of 1,295 submissions, 20%
