Abstract
With the rise of machine learning in production, we need to talk about operational data science. The talk introduces a pragmatic method for measuring the response quality of a recommendation service. To that end, a definition of a successful response is introduced and guidelines for capturing the rate of successful responses are presented.
Several changes can happen during the serving phase of a model that negatively affect the quality of the algorithmic response. A few examples are:
• The model is updated and the new version is inferior to the previous one.
• The latest deployment of the stack that processes the request and serves the model contains a bug.
• Changes in the infrastructure lead to performance loss. An example in an e-commerce setting is switching to a different microservice to obtain article metadata used for filtering the recommendations.
• The input data changes. Typical reasons are a client application that releases a bug (e.g., lowercasing a case-sensitive identifier) or changes a feature in a way that affects the data distribution, such as allowing all users to use the product cart instead of only logged-in users. If the change is not detected, training data and serving data diverge (a minimal check is sketched below).
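To illustrate the last point, here is a minimal sketch, not the approach presented in the talk, of how such a train/serve divergence could be flagged for a single categorical input field; the names and the threshold are hypothetical.

    # A minimal sketch of flagging train/serve divergence for one categorical
    # input field, e.g. an identifier whose casing a client accidentally changed.
    # All names and the 10% threshold are hypothetical.
    from collections import Counter

    def share_of_unknown_values(serving_values, training_vocabulary):
        """Fraction of serving-time values never seen during training."""
        counts = Counter(serving_values)
        unknown = sum(n for value, n in counts.items() if value not in training_vocabulary)
        return unknown / max(sum(counts.values()), 1)

    # Example: a client starts lowercasing a case-sensitive identifier.
    training_vocabulary = {"SKU-A1", "SKU-B2", "SKU-C3"}
    serving_values = ["sku-a1", "SKU-B2", "sku-c3", "sku-a1"]
    if share_of_unknown_values(serving_values, training_vocabulary) > 0.1:
        print("ALERT: serving inputs diverge from the training data")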
Current monitoring solutions mostly focus on whether a request completes without errors and on the request latency. This means the examples above would be hard to detect, even though the response quality is significantly degraded, sometimes permanently.
Beyond not detecting the changes above, it can be argued that current monitoring practices are not sufficient to capture the performance of a recommender system, or any other data-driven service, in a meaningful way. We might, for instance, have returned popular articles as a fallback where personalized recommendations were requested. That response should be recorded as unsuccessful.
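A minimal sketch of that idea, with hypothetical field names: a fallback response completes without errors, but it is not counted as successful.

    # A fallback to popular articles returns HTTP 200 and low latency,
    # yet it should not count as a successful personalized response.
    def is_successful(response):
        """Successful only if personalized recommendations were returned."""
        return bool(response.get("recommendations")) and not response.get("is_fallback", False)

    responses = [
        {"recommendations": ["article-1", "article-2"], "is_fallback": False},
        {"recommendations": ["bestseller-1"], "is_fallback": True},  # counted as unsuccessful
    ]
    success_rate = sum(is_successful(r) for r in responses) / len(responses)
    print(f"success rate: {success_rate:.0%}")  # 50%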
A new paradigm for measuring response quality should fulfil the following criteria:
• comparability across models
• simple and understandable metrics
• real-time collection of measurements
• actionable alerting on problems (a collection sketch follows this list)
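One possible way to collect such measurements in real time is a pair of counters labelled per model and use case; the use of Prometheus client counters and the metric names below are assumptions for illustration, not the setup described in the talk.

    # Hypothetical real-time collection: count all responses and successful
    # responses per model and use case.
    from prometheus_client import Counter

    RESPONSES_TOTAL = Counter(
        "reco_responses_total", "All recommendation responses", ["model", "use_case"]
    )
    RESPONSES_SUCCESSFUL = Counter(
        "reco_responses_successful_total", "Successful recommendation responses", ["model", "use_case"]
    )

    def record_response(model, use_case, successful):
        # One counter pair per model and use case keeps the success rate
        # simple, comparable across models, and easy to alert on.
        RESPONSES_TOTAL.labels(model=model, use_case=use_case).inc()
        if successful:
            RESPONSES_SUCCESSFUL.labels(model=model, use_case=use_case).inc()

Alerting can then be a threshold on the ratio of the two counters over a time window, which keeps the signal simple and comparable across models.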
The response quality is defined as an approximation of how well the response fits the defined business and modelling case. The goal is to bridge the gap between the metrics used during model training and technical monitoring metrics. Ideally, we would like to obtain Service Level Objectives (SLOs) [1] that contain this quality aspect and can be discussed with the different client applications based on their business cases, e.g., "85% of the order confirmation emails contain personalized recommendations based on the purchase."
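To make the email example concrete, a minimal sketch of evaluating such an SLO over a window of responses; the 85% target comes from the example above, while the field names and the window are hypothetical.

    # Evaluate the example SLO over a window of order confirmation emails.
    SLO_TARGET = 0.85

    def slo_met(email_responses):
        """True if the share of emails containing personalized
        recommendations meets the 85% objective."""
        if not email_responses:
            return False
        personalized = sum(1 for r in email_responses if r["personalized"])
        return personalized / len(email_responses) >= SLO_TARGET

    window = [{"personalized": True}] * 90 + [{"personalized": False}] * 10
    print(slo_met(window))  # True: 90% >= 85%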
A case study will illustrate how algorithmic monitoring was introduced in the recommendation team at Zalando. Zalando is one of Europe's largest fashion retailers, and multiple recommendation algorithms serve many of its online and offline use cases. You will see several examples of how the monitoring helped to identify bugs or diagnose quality problems.