DOI: 10.1145/3485447.3512067

Outlier Detection for Streaming Task Assignment in Crowdsourcing

Published: 25 April 2022

Abstract

Crowdsourcing aims to enable the assignment of available resources to the completion of tasks at scale. The continued digitization of societal processes translates into increased opportunities for crowdsourcing. For example, crowdsourcing enables the assignment of the computational resources of humans, called workers, to tasks that are notoriously hard for computers. In settings with malicious actors, detecting such actors holds the potential to increase the robustness of crowdsourcing platforms. We propose a framework called Outlier Detection for Streaming Task Assignment that aims to improve robustness by detecting malicious actors. In particular, we model the arrival of workers and the submission of tasks as evolving time series and detect malicious actors by means of outlier detection. We propose a novel socially aware Generative Adversarial Network (GAN) based architecture that is capable of contending with the complex distributions found in time series. The architecture includes two GANs that are designed to adversarially train an autoencoder to learn the patterns of distributions in worker and task time series, thus enabling outlier detection based on reconstruction errors. A GAN structure encompasses a game between a generator and a discriminator, where it is desirable that the two learn to coordinate towards socially optimal outcomes while avoiding being exploited by selfish opponents. To this end, we propose a novel training approach that incorporates social awareness into the loss functions of the two GANs. Additionally, to improve task assignment efficiency, we propose an efficient greedy algorithm based on degree reduction that transforms task assignment into a bipartite graph matching problem. Extensive experiments offer insight into the effectiveness and efficiency of the proposed framework.
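
The two stages named in the abstract, outlier detection from reconstruction errors followed by task assignment as bipartite matching, can be illustrated with a minimal sketch. It is not the paper's method: the reconstruction errors are hypothetical values that a trained autoencoder is assumed to have produced, the mean-plus-k-standard-deviations threshold is an illustrative choice, and the matching routine is a generic lowest-degree-first greedy heuristic that may differ from the degree-reduction algorithm the authors propose. All function and variable names are invented for this example.

```python
from statistics import mean, pstdev


def flag_outliers(errors, k=1.5):
    """Flag ids whose reconstruction error exceeds mean + k * std.

    `errors` maps an id to a reconstruction error that a trained
    autoencoder is assumed to have produced. The threshold rule is an
    illustrative assumption, not taken from the paper.
    """
    values = list(errors.values())
    threshold = mean(values) + k * pstdev(values)
    return {i for i, e in errors.items() if e > threshold}


def greedy_min_degree_matching(edges):
    """Greedy bipartite matching that serves the lowest-degree worker first.

    `edges` maps each worker to the set of tasks it can perform.
    Returns a dict worker -> task. Generic heuristic (assumption).
    """
    remaining = {w: set(ts) for w, ts in edges.items()}
    matching = {}
    while remaining:
        # Drop workers that have no feasible task left.
        remaining = {w: ts for w, ts in remaining.items() if ts}
        if not remaining:
            break
        # Pick the worker with the fewest remaining options (ties by name).
        worker = min(remaining, key=lambda w: (len(remaining[w]), w))
        # Among its tasks, pick the one wanted by the fewest workers.
        task = min(
            remaining[worker],
            key=lambda t: (sum(t in ts for ts in remaining.values()), t),
        )
        matching[worker] = task
        del remaining[worker]
        for ts in remaining.values():
            ts.discard(task)  # the task is no longer available
    return matching


# Hypothetical reconstruction errors and feasible worker-task pairs.
worker_errors = {"w1": 0.11, "w2": 0.09, "w3": 0.10, "w4": 0.12, "w5": 0.95}
feasible = {
    "w1": {"t1", "t2"},
    "w2": {"t1"},
    "w3": {"t2", "t3"},
    "w4": {"t3"},
    "w5": {"t1", "t2", "t3"},
}

suspects = flag_outliers(worker_errors, k=1.5)
trusted = {w: ts for w, ts in feasible.items() if w not in suspects}
assignment = greedy_min_degree_matching(trusted)
```

On this toy input, `w5` is the only worker flagged, and the remaining workers cover all three tasks. Serving the most constrained worker first is a common way to keep a greedy matching from stranding workers that have a single feasible task.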

        Published In

        WWW '22: Proceedings of the ACM Web Conference 2022
        April 2022
        3764 pages
        ISBN:9781450390965
        DOI:10.1145/3485447
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. crowdsourcing
        2. outlier detection
        3. task assignment
        4. time series

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        WWW '22
        WWW '22: The ACM Web Conference 2022
        April 25 - 29, 2022
        Virtual Event, Lyon, France

        Acceptance Rates

        Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
