DOI: 10.1145/3534678.3539152
research-article
Open access

Vexation-Aware Active Learning for On-Menu Restaurant Dish Availability

Published: 14 August 2022

Abstract

Here we leverage the power of the crowd: online users who are willing to answer questions about dish availability at restaurants visited. While motivated users are happy to contribute knowledge, they are much less likely to respond to "silly" or embarrassing questions (e.g., "Does Pizza Hut serve pizza?" or "Does Mike's Vegan Restaurant serve steak?").
In this paper, we study the problem of Vexation-Aware Active Learning (VAAL), where judiciously selected questions are targeted towards improving restaurant-dish model prediction, subject to a limit on the percentage of "unsure" answers or "dismissals" (e.g., swiping the app closed) measuring user vexation. We formalize the selection problem as an integer program and solve it efficiently using a distributed solution that scales linearly with the number of candidate questions. Since our algorithm relies on an accurate estimation of the unsure-dismiss rate (UDR), we present a regression model that provides high-quality results compared to baselines including collaborative filtering. Finally, we demonstrate in a live system that our proposed VAAL strategy performs competitively against classical (margin-based) active learning approaches while reducing the UDR for the questions being asked.
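
The selection step described in the abstract can be illustrated with a small sketch. This is not the paper's integer program or its distributed solver; it is a single-machine heuristic under stated assumptions: each candidate question has a hypothetical informativeness score `utility` (for example, a margin-based uncertainty) and a predicted unsure-dismiss rate `udr`, and the goal is to pick `k` questions with high total utility while keeping the mean predicted UDR at or below `max_udr`. The function name, parameters, and the Lagrangian-relaxation approach are all assumptions made for illustration.

```python
from typing import List, Sequence


def select_questions(
    utility: Sequence[float],
    udr: Sequence[float],
    k: int,
    max_udr: float,
    iters: int = 50,
) -> List[int]:
    """Pick k candidate indices with high total utility and mean UDR <= max_udr.

    Heuristic: penalise each candidate by lam * udr, take the top k, and
    bisect on lam until the UDR limit holds. It may miss the true optimum.
    """
    n = len(utility)
    if len(udr) != n or not 0 < k <= n:
        raise ValueError("need equal-length inputs and 0 < k <= n")

    def mean_udr(idx: List[int]) -> float:
        return sum(udr[i] for i in idx) / k

    def top_k(lam: float) -> List[int]:
        order = sorted(range(n), key=lambda i: utility[i] - lam * udr[i], reverse=True)
        return order[:k]

    # The k least vexing questions decide whether the limit is reachable at all.
    safest = sorted(range(n), key=lambda i: udr[i])[:k]
    if mean_udr(safest) > max_udr:
        raise ValueError("no k questions satisfy the UDR limit")

    # With no penalty this is plain utility-based selection.
    best = top_k(0.0)
    if mean_udr(best) <= max_udr:
        return sorted(best)

    # Grow the penalty until the selection becomes feasible.
    lo, hi = 0.0, 1.0
    while mean_udr(top_k(hi)) > max_udr:
        lo, hi = hi, hi * 2.0
        if hi > 1e12:
            return sorted(safest)
    best = top_k(hi)

    # Bisect towards the smallest penalty that keeps the selection feasible.
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        cand = top_k(mid)
        if mean_udr(cand) <= max_udr:
            hi, best = mid, cand
        else:
            lo = mid
    return sorted(best)


if __name__ == "__main__":
    # Hypothetical scores for five candidate questions.
    utility = [0.97, 0.8, 0.7, 0.3, 0.2]
    udr = [0.8, 0.6, 0.1, 0.05, 0.1]
    print(select_questions(utility, udr, k=2, max_udr=0.4))
```

With the penalty at zero the sketch reduces to ordinary utility-driven selection, so the UDR limit only changes the outcome when the most informative questions are also the most vexing ones.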


Published In

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2022
5033 pages
ISBN: 9781450393850
DOI: 10.1145/3534678
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. active learning
  2. crowdsourcing
  3. user-generated content

Qualifiers

  • Research-article

Conference

KDD '22

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
