Abstract
Human-annotated data is a prerequisite for training and evaluating computer vision algorithms; such data is referred to as “ground truth” data. In this chapter, we describe the strategies and systems we devised to obtain the ground truth data required by each of the computer vision components within the Fish4Knowledge system, including fish detection, tracking, and recognition.
Notes
1.
2. A case is an \(\{\textit{image, expert label}\}\) pair, hence \(190 \times 3\) cases in total.
3. When multiple labels exist for an image assigned by one expert, we randomly draw one of them to be evaluated; we repeat this process 100 times and report the mean \(\kappa \) and its standard deviation over the 100 runs. Agreement calculated in this way is rather conservative.
4. A case is an \(\{\textit{image(species), user}\}\) pair.
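The averaging procedure in note 3 can be sketched as follows. This is a minimal illustration, not the chapter's actual implementation: the function names (`cohen_kappa`, `averaged_kappa`) and the input layout (one list of candidate labels per image, per expert) are assumptions made for the example.

```python
import random
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two annotators' label sequences."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of images with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

def averaged_kappa(multi_a, multi_b, runs=100, seed=0):
    """multi_a / multi_b: per-image lists of candidate labels from each
    expert. Draw one label per image per expert, compute kappa, and
    repeat; return the mean kappa and its standard deviation."""
    rng = random.Random(seed)
    kappas = []
    for _ in range(runs):
        draw_a = [rng.choice(labels) for labels in multi_a]
        draw_b = [rng.choice(labels) for labels in multi_b]
        kappas.append(cohen_kappa(draw_a, draw_b))
    mean = sum(kappas) / runs
    std = (sum((k - mean) ** 2 for k in kappas) / runs) ** 0.5
    return mean, std
```

Because one label is drawn at random whenever an expert gave several, two experts who share at least one label on an image may still disagree in a given run, which is why agreement computed this way is conservative.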
© 2016 Springer International Publishing Switzerland
Cite this chapter
He, J., Spampinato, C., Boom, B.J., Kavasidis, I. (2016). Data Groundtruthing and Crowdsourcing. In: Fisher, R., Chen-Burger, YH., Giordano, D., Hardman, L., Lin, FP. (eds) Fish4Knowledge: Collecting and Analyzing Massive Coral Reef Fish Video Data. Intelligent Systems Reference Library, vol 104. Springer, Cham. https://doi.org/10.1007/978-3-319-30208-9_14
DOI: https://doi.org/10.1007/978-3-319-30208-9_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30206-5
Online ISBN: 978-3-319-30208-9
eBook Packages: Engineering (R0)