A Crowdsourcing Repeated Annotations System for Visual Object Detection

ABSTRACT
As a fundamental task in computer vision, object detection has developed rapidly, driven by deep learning. The lack of large numbers of images with ground-truth annotations has become a chief obstacle to applying object detection in many fields. Eliciting labels from crowds is a promising way to obtain large labeled datasets. Nonetheless, existing crowdsourcing platforms, e.g., Amazon Mechanical Turk (MTurk), often fail to guarantee the quality of the annotations, which degrades the accuracy of the deep detector. A variety of methods have been developed for ground truth inference and learning from crowds. In this paper, we study strategies for crowdsourcing repeated labels in support of these methods. The core challenge in building such a system is to reduce the difficulty of annotating multiple objects of interest while improving data quality as much as possible. We present a system that adopts a turn-based annotation mechanism and consists of three simple sub-tasks: single-object annotation, a quality verification task, and a coverage verification task. Experimental results demonstrate that our system is scalable and accurate, and helps the detector achieve higher accuracy.
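The abstract refers to inferring ground truth from repeated labels. As a minimal sketch (not the paper's actual inference method), the example below aggregates repeated bounding-box annotations from several workers: boxes are greedily clustered by IoU overlap, and each cluster with enough worker votes is reduced to its coordinate-wise median box. The function names, IoU threshold, and vote threshold are all illustrative assumptions.

```python
# Hypothetical sketch of ground-truth inference from repeated bounding-box
# annotations. Boxes are (x1, y1, x2, y2) tuples in pixel coordinates.
from statistics import median

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def aggregate(boxes, iou_thr=0.5, min_votes=2):
    """Greedily cluster repeated annotations by IoU, then return one
    coordinate-wise median box per cluster that received at least
    `min_votes` worker annotations (spurious lone boxes are dropped)."""
    clusters = []
    for box in boxes:
        for cluster in clusters:
            if iou(box, cluster[0]) >= iou_thr:
                cluster.append(box)
                break
        else:
            clusters.append([box])
    return [tuple(median(b[i] for b in c) for i in range(4))
            for c in clusters if len(c) >= min_votes]

# Three workers annotate the same object; a fourth box is spurious noise.
workers = [(10, 10, 50, 50), (12, 11, 52, 49), (9, 12, 48, 51),
           (200, 200, 240, 240)]
print(aggregate(workers))  # the lone noise box is filtered out
```

Taking the median rather than the mean makes the consensus box robust to a single careless worker, which is the usual motivation for collecting repeated labels in the first place.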