Abstract
In supervised machine learning, obtaining labels for data is expensive, or labels are simply difficult to come by. This has led to a reliance on crowdworking for label acquisition. However, such labels are often unreliable, which creates a need to assess label reliability. One such assessment can be performed by determining the difficulty of the labeling task performed by the crowdworker: an annotator's stress level while performing a task can be indicative of its difficulty. We propose a time series classification approach that learns on the stress signals of crowdworkers to distinguish between easy and difficult exemplary instance-labeling tasks. To transfer this classifier to a labeling task of a different nature, we propose two types of time series classification models: a global model trained on the data of all annotators, and individual-centric models trained on the time series of each annotator. We incorporate this approach into an instance-labeling framework with one phase for learning on exemplary tasks and one phase for the characterization of unknown tasks; in other words, the model is trained on one data distribution and then used to classify data from another distribution. We show that the individual-centric models outperform their global counterpart in many cases, and we report notable performance by the classification models overall.
N. Jambigi and T. Chanda—Equal contribution.
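The abstract describes two model families (a global classifier trained on the pooled data of all annotators, and one individual-centric classifier per annotator) and two phases (learning on exemplary tasks of known difficulty, then characterizing unknown tasks of a different nature). The sketch below illustrates that structure only; the synthetic stress signals, the summary-statistic features, and the RandomForestClassifier are illustrative assumptions and not the authors' actual feature extraction or classifier.

```python
# Minimal sketch (not the authors' code): global vs. individual-centric
# classification of crowdworker stress signals. Data, features, and
# classifier choice are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def features(signal):
    """Summary-statistic features of one stress time series (e.g. EDA)."""
    return [signal.mean(), signal.std(), signal.min(), signal.max(),
            np.percentile(signal, 75) - np.percentile(signal, 25)]

# Synthetic stand-in data: per annotator, stress signals recorded during
# exemplary labeling tasks with known difficulty (0 = easy, 1 = difficult).
rng = np.random.default_rng(0)
annotators = ["a1", "a2", "a3"]
exemplary = {a: [(rng.normal(d, 1.0 + d, size=120), d)
                 for d in rng.integers(0, 2, size=40)]
             for a in annotators}

# Phase 1a, global model: pool the exemplary signals of all annotators.
X_glob = np.array([features(s) for a in annotators for s, _ in exemplary[a]])
y_glob = np.array([d for a in annotators for _, d in exemplary[a]])
global_model = RandomForestClassifier(random_state=0).fit(X_glob, y_glob)

# Phase 1b, individual-centric models: one classifier per annotator.
individual_models = {}
for a in annotators:
    X = np.array([features(s) for s, _ in exemplary[a]])
    y = np.array([d for _, d in exemplary[a]])
    individual_models[a] = RandomForestClassifier(random_state=0).fit(X, y)

# Phase 2: characterize an unknown task of a different nature by classifying
# the stress signal an annotator produced while labeling it.
unknown_signal = rng.normal(1.0, 2.0, size=120)
x = np.array([features(unknown_signal)])
print("global:", global_model.predict(x)[0],
      "individual (a1):", individual_models["a1"].predict(x)[0])
```

In practice, the synthetic signals would be replaced by recorded stress measurements (for example electrodermal activity) and the hand-picked statistics by richer time series features, but the global-versus-individual split and the two phases remain the same.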
Notes
1. In this work, we use the terms annotation and labeling interchangeably.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Jambigi, N., Chanda, T., Unnikrishnan, V., Spiliopoulou, M. (2020). Assessing the Difficulty of Labelling an Instance in Crowdworking. In: Koprinska, I., et al. ECML PKDD 2020 Workshops. ECML PKDD 2020. Communications in Computer and Information Science, vol 1323. Springer, Cham. https://doi.org/10.1007/978-3-030-65965-3_24
DOI: https://doi.org/10.1007/978-3-030-65965-3_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-65964-6
Online ISBN: 978-3-030-65965-3
eBook Packages: Computer Science (R0)