Assessing the Difficulty of Labelling an Instance in Crowdworking

  • Conference paper
  • In: ECML PKDD 2020 Workshops (ECML PKDD 2020)

Abstract

In supervised machine learning, obtaining labels for data is expensive or the labels are difficult to come by, which has led to a reliance on crowdworking for label acquisition. However, crowdsourced labels come with a penalty of unreliability, creating the need to assess their reliability. One such assessment can be performed by determining the difficulty of the labeling task the crowdworker performs, since annotator stress levels during a task can be indicative of its difficulty. We propose a time series classification approach that learns on the stress signals of crowdworkers to distinguish between easy and difficult exemplary instance-labeling tasks. To transfer this classifier to a labeling task of a different nature, we propose two types of time series classification models: a global model trained on the data of all annotators, and individual-centric models trained on the time series of each annotator. We incorporate this approach into an instance-labeling framework with one phase for learning on exemplary tasks and one phase for characterizing unknown tasks; in other words, the model is trained on one data distribution and then used to classify data from another. We show that the individual-centric models often achieve better performance than their global counterparts, and we report notable performance by the classification models overall.

N. Jambigi and T. Chanda contributed equally to this work.
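To make the two model types concrete, here is a minimal sketch on synthetic data, not the authors' code: it trains the BOSSVS classifier from the pyts library (pointed to by footnote 3 below) once globally on all annotators' pooled stress series and once per annotator. The annotator count, series length, labels, and hyperparameters are illustrative assumptions.

```python
# Minimal sketch of the global vs. individual-centric setup, on synthetic
# data. BOSSVS is the pyts classifier named in footnote 3; all sizes and
# hyperparameters here are illustrative assumptions, not the paper's values.
import numpy as np
from pyts.classification import BOSSVS

rng = np.random.default_rng(0)
n_annotators, series_per_annotator, n_timestamps = 5, 40, 128

# One (stress series, easy/difficult label) set per annotator.
# In the paper these would be stress recordings from exemplary labeling tasks.
data = {
    a: (rng.standard_normal((series_per_annotator, n_timestamps)),
        rng.integers(0, 2, series_per_annotator))  # 0 = easy, 1 = difficult
    for a in range(n_annotators)
}

# Global model: pool every annotator's series into one training set.
X_all = np.vstack([X for X, _ in data.values()])
y_all = np.concatenate([y for _, y in data.values()])
global_model = BOSSVS(word_size=4, n_bins=4, window_size=32).fit(X_all, y_all)

# Individual-centric models: one classifier per annotator.
individual_models = {
    a: BOSSVS(word_size=4, n_bins=4, window_size=32).fit(X, y)
    for a, (X, y) in data.items()
}

# Characterizing an unknown task: classify a new stress recording of
# annotator 0 with both model types.
new_series = rng.standard_normal((1, n_timestamps))
print("global prediction:    ", global_model.predict(new_series))
print("individual prediction:", individual_models[0].predict(new_series))
```

In the framework's second phase, the trained models would be applied to stress recordings from labeling tasks of a different nature than the exemplary ones.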


Notes

  1. In this work, we use the terms annotation and labeling interchangeably.

  2. https://www.movisens.com/en/products/eda-and-activity-sensor-move-3/.

  3. https://pyts.readthedocs.io/en/stable/generated/pyts.classification.{BOSSVS.html,WEASEL.html}.

  4. https://scikit-learn.org/stable/modules/generated/sklearn.{svm.SVC.html,ensemble.RandomForestClassifier.html}.
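Footnote 4 lists feature-based classifiers, which expect fixed-length feature vectors rather than raw series. A hedged sketch of that pattern follows; the summary statistics are our illustrative choice, not the paper's exact feature set.

```python
# Hedged sketch of the feature-based route via the scikit-learn classifiers
# in footnote 4: each stress series is first summarized into a fixed-length
# feature vector. The statistics below are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

def summarize(series: np.ndarray) -> np.ndarray:
    """Map one time series to a small, fixed-length feature vector."""
    return np.array([series.mean(), series.std(),
                     series.min(), series.max(),
                     np.abs(np.diff(series)).mean()])

rng = np.random.default_rng(1)
X_series = rng.standard_normal((80, 128))  # 80 synthetic stress series
y = rng.integers(0, 2, 80)                 # 0 = easy, 1 = difficult

X_feat = np.vstack([summarize(s) for s in X_series])
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_feat, y)
svm = SVC(kernel="rbf").fit(X_feat, y)
print(forest.predict(X_feat[:3]), svm.predict(X_feat[:3]))
```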


Author information

Correspondence to Neetha Jambigi.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Jambigi, N., Chanda, T., Unnikrishnan, V., Spiliopoulou, M. (2020). Assessing the Difficulty of Labelling an Instance in Crowdworking. In: Koprinska, I., et al. ECML PKDD 2020 Workshops. ECML PKDD 2020. Communications in Computer and Information Science, vol 1323. Springer, Cham. https://doi.org/10.1007/978-3-030-65965-3_24

  • DOI: https://doi.org/10.1007/978-3-030-65965-3_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-65964-6

  • Online ISBN: 978-3-030-65965-3

  • eBook Packages: Computer Science, Computer Science (R0)
