research-article

Outlier Detection for Streaming Task Assignment in Crowdsourcing

Authors:

Christian S. Jensen

WWW '22: Proceedings of the ACM Web Conference 2022

Pages 1933 - 1943

https://doi.org/10.1145/3485447.3512067

Published: 25 April 2022

Abstract

Crowdsourcing aims to enable the assignment of available resources to the completion of tasks at scale. The continued digitization of societal processes translates into increased opportunities for crowdsourcing. For example, crowdsourcing enables the assignment of computational resources of humans, called workers, to tasks that are notoriously hard for computers. In settings with malicious actors, detecting such actors can increase the robustness of crowdsourcing platforms. We propose a framework called Outlier Detection for Streaming Task Assignment that aims to improve robustness by detecting malicious actors. In particular, we model the arrival of workers and the submission of tasks as evolving time series and detect malicious actors by means of outlier detection on these series. We propose a novel socially aware Generative Adversarial Network (GAN) based architecture that is capable of contending with the complex distributions found in time series. The architecture includes two GANs that are designed to adversarially train an autoencoder to learn the patterns of distributions in worker and task time series, thus enabling outlier detection based on reconstruction errors. A GAN structure encompasses a game between a generator and a discriminator, where it is desirable that the two learn to coordinate towards socially optimal outcomes while avoiding exploitation by selfish opponents. To this end, we propose a novel training approach that incorporates social awareness into the loss functions of the two GANs. Additionally, to improve task assignment efficiency, we propose an efficient greedy algorithm based on degree reduction that transforms task assignment into a bipartite graph matching. Extensive experiments offer insight into the effectiveness and efficiency of the proposed framework.
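The two mechanisms named in the abstract, flagging outliers from reconstruction errors and matching workers to tasks greedily on a bipartite graph, can be illustrated with a minimal Python sketch. It is not the paper's method. The reconstruction errors are assumed to be precomputed by some autoencoder. The mean-plus-k-sigma threshold, the minimum-degree greedy rule, and the names `flag_outliers` and `greedy_min_degree_matching` are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev


def flag_outliers(errors, reference_errors, k=3.0):
    """Return the ids whose reconstruction error exceeds a threshold.

    `errors` maps an id to a precomputed reconstruction error.
    `reference_errors` holds errors observed on data assumed to be normal.
    The mean + k * stdev rule is an illustrative assumption.
    """
    if len(reference_errors) < 2:
        raise ValueError("need at least two reference errors")
    threshold = mean(reference_errors) + k * stdev(reference_errors)
    return {i for i, e in errors.items() if e > threshold}


def greedy_min_degree_matching(edges):
    """Greedy bipartite matching that serves the scarcest worker first.

    `edges` maps each worker to the set of tasks it can perform.
    Repeatedly pick the worker with the fewest remaining candidate tasks,
    give it the candidate task wanted by the fewest remaining workers,
    then remove both from the graph. This is a heuristic, so the result
    is not guaranteed to be a maximum matching.
    """
    remaining = {w: set(ts) for w, ts in edges.items() if ts}
    matching = {}
    while remaining:
        # Worker with the smallest current degree (ties broken by id).
        worker = min(remaining, key=lambda w: (len(remaining[w]), w))

        def task_degree(t):
            # Number of remaining workers that could still take task t.
            return sum(1 for ts in remaining.values() if t in ts)

        task = min(remaining[worker], key=lambda t: (task_degree(t), t))
        matching[worker] = task
        del remaining[worker]
        # Remove the assigned task; drop workers left with no candidates.
        for w in list(remaining):
            remaining[w].discard(task)
            if not remaining[w]:
                del remaining[w]
    return matching


# Hypothetical data: w3 has an unusually high reconstruction error.
reference = [0.10, 0.12, 0.09, 0.11]
errors = {"w1": 0.10, "w2": 0.12, "w3": 0.50}
flagged = flag_outliers(errors, reference)

# Only workers that were not flagged take part in the assignment.
candidates = {"w1": {"t1", "t2"}, "w2": {"t1"}, "w3": {"t1", "t2"}}
trusted = {w: ts for w, ts in candidates.items() if w not in flagged}
print(flagged, greedy_min_degree_matching(trusted))
```

In this toy input, serving the lowest-degree worker first assigns `t1` to `w2`, whose only option it is, and leaves `t2` for `w1`. A greedy pass in id order could instead give `t1` to `w1` and leave `w2` unassigned.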





Published In


WWW '22: Proceedings of the ACM Web Conference 2022

April 2022

3764 pages

ISBN: 9781450390965

DOI: 10.1145/3485447

Editors:
Frédérique Laforest (INSA Lyon, France)
Raphaël Troncy (EURECOM, France)
Elena Simperl (King’s College London, UK)
Deepak Agarwal (Pinterest, USA)
Aristides Gionis (KTH Royal Institute of Technology, Sweden)
Ivan Herman (W3C / retired)
Lionel Médini (Université Lyon 1, France)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Research-article
Research
Refereed limited

Conference

WWW '22

Sponsor:

SIGWEB

WWW '22: The ACM Web Conference 2022

April 25 - 29, 2022

Virtual Event, Lyon, France

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%


Bibliometrics

Article Metrics

Total Citations: 19
Total Downloads: 411
Downloads (Last 12 months): 68
Downloads (Last 6 weeks): 10

Reflects downloads up to 08 Mar 2025


