Incorporating pixel proximity into answer aggregation for crowdsourced image segmentation

  • Regular Paper
  • CCF Transactions on Pervasive Computing and Interaction

Abstract

Crowdsourcing has been applied successfully to a wide spectrum of tasks. However, technical challenges remain for complex tasks such as image segmentation. One reason is that macro tasks usually have complex hidden internal structures that are difficult to capture and exploit. In this work, we address answer aggregation for crowdsourced image segmentation. Recent crowdsourcing research on both general tasks and image segmentation ignores the hidden structural information inside a task. To fill this gap, we propose a method named CrowdSeg that aggregates the set of segmentations given by crowdsourcing workers to obtain a satisfactory image segmentation result. First, we propose a model based on a convolutional auto-encoder that extracts the proximity information of adjacent pixels and represents it as embedding features. Second, since each pixel is either background or object, we cluster the pixels on their embedding features with the k-means algorithm and decide whether each cluster is object or background according to the result accepted by the majority. The proposed method outperforms the baselines on four real-world datasets of biomedical images collected from workers on a real-world crowdsourcing platform. The experimental results show that our method is more effective and more stable than the other methods. We have released the real-world datasets and the source code for all experiments.
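The second step described above (cluster pixels on per-pixel features, then label each cluster by majority) can be illustrated with a minimal sketch. This is not the authors' CrowdSeg implementation: the learned convolutional auto-encoder embedding is replaced here by a hand-made stand-in feature (the fraction of workers voting "object" at a pixel and the average of that fraction over its neighborhood), and the function names `pixel_features`, `kmeans_two`, and `aggregate` are hypothetical. Worker answers are assumed to be binary masks of equal size.

```python
def pixel_features(masks, radius=1):
    """Stand-in for a learned embedding (an assumption, not the paper's model):
    each pixel gets (vote fraction, mean vote fraction in its neighborhood)."""
    h, w = len(masks[0]), len(masks[0][0])
    frac = [[sum(m[y][x] for m in masks) / len(masks) for x in range(w)]
            for y in range(h)]
    feats = []
    for y in range(h):
        for x in range(w):
            window = [frac[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            feats.append((frac[y][x], sum(window) / len(window)))
    return feats


def kmeans_two(points, iters=20):
    """Plain k-means with k=2 and a deterministic initialisation."""
    centroids = [min(points), max(points)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to the nearest centroid (squared Euclidean distance).
        labels = [min((0, 1),
                      key=lambda k: sum((a - b) ** 2
                                        for a, b in zip(p, centroids[k])))
                  for p in points]
        # Move each centroid to the mean of its members (skip empty clusters).
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = tuple(sum(col) / len(col) for col in zip(*members))
    return labels


def aggregate(masks):
    """Cluster pixels, then call a cluster 'object' if most of its pixels
    are 'object' under per-pixel majority voting among the workers."""
    h, w = len(masks[0]), len(masks[0][0])
    labels = kmeans_two(pixel_features(masks))
    majority = [int(2 * sum(m[y][x] for m in masks) > len(masks))
                for y in range(h) for x in range(w)]
    is_object = {}
    for k in (0, 1):
        votes = [v for v, lab in zip(majority, labels) if lab == k]
        is_object[k] = int(bool(votes) and 2 * sum(votes) > len(votes))
    return [[is_object[labels[y * w + x]] for x in range(w)] for y in range(h)]


# Three hypothetical workers segment a 4x4 image; the third adds a stray pixel.
truth = [[0, 0, 0, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
noisy = [row[:] for row in truth]
noisy[0][0] = 1
result = aggregate([truth, truth, noisy])
```

In this toy case the stray pixel falls into the background cluster, so `result` matches `truth`. A learned embedding, as proposed in the paper, would replace `pixel_features`; the clustering and cluster-labelling steps would stay structurally the same.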

Availability of data and material

Biomedical Image Library data are publicly available at http://www.cs.bu.edu/~betke/BiomedicalImageSegmentation.

Code availability

The code for the baselines was published previously (Zheng et al. 2017). The code for our proposed method, CrowdSeg, is available at https://github.com/yangyi19/CrowdSeg.git

References

  • Aydin, B.I., Yilmaz, Y.S., Li, Y., Li, Q., Gao, J., Demirbas, M.: Crowdsourcing for multiple-choice question answering. In: Twenty-Sixth IAAI Conference (2014)

  • Brizan, D.G., Tansel, A.U.: A survey of entity resolution and record linkage methodologies. Commun. IIMA 6(3), 5 (2006)

  • Cabezas, F., Carlier, A., Charvillat, V., Salvador, A., Giro-i Nieto, X.: Quality control in crowdsourced object segmentation. In: 2015 IEEE International Conference on Image Processing (ICIP). IEEE, pp. 4243–4247 (2015)

  • Carlier, A., Charvillat, V., Salvador, A., Giro-i Nieto, X., Marques, O.: Click’n’cut: crowdsourced interactive segmentation with object candidates. In: Proceedings of the 2014 International ACM Workshop on Crowdsourcing for Multimedia, pp. 53–56 (2014)

  • Dawid, A.P., Skene, A.M.: Maximum likelihood estimation of observer error-rates using the EM algorithm. J. R. Stat. Soc. Ser. C (Appl. Stat.) 28(1), 20–28 (1979)

  • Demartini, G., Difallah, D.E., Cudré-Mauroux, P.: Zencrowd: leveraging probabilistic reasoning and crowdsourcing techniques for large-scale entity linking. In: Proceedings of the 21st International Conference on World Wide Web, pp. 469–478 (2012)

  • Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B (Methodol.) 39(1), 1–22 (1977)

  • Fan, J., Li, G., Ooi, B.C., Tan, K.l., Feng, J.: icrowd: an adaptive crowdsourcing framework. In: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pp. 1015–1030 (2015)

  • Gurari, D., Theriault, D., Sameki, M., Isenberg, B., Pham, T.A., Purwada, A., Solski, P., Walker, M., Zhang, C., Wong, J.Y., et al.: How to collect segmentations for biomedical images? A benchmark evaluating the performance of experts, crowdsourced non-experts, and algorithms. In: 2015 IEEE Winter Conference on Applications of Computer Vision. IEEE, pp. 1169–1176 (2015)

  • Gurari, D., Sameki, M., Wu, Z., Betke, M.: Mixing crowd and algorithm efforts to segment objects in biomedical images. In: Medical Image Computing and Computer Assisted Intervention Interactive Medical Image Computation Workshop, pp. 1–8 (2016)

  • He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  • He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)

  • Irshad, H., Montaser-Kouhsari, L., Waltz, G., Bucur, O., Nowak, J., Dong, F., Knoblauch, N.W., Beck, A.H.: Crowdsourcing image annotation for nucleus detection and segmentation in computational pathology: evaluating experts, automated methods, and the crowd. In: Pacific Symposium on Biocomputing, pp. 294–305. World Scientific (2014)

  • Jang, W.D., Kim, C.S.: Interactive image segmentation via backpropagating refinement scheme. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5297–5306 (2019)

  • Karger, D.R., Oh, S., Shah, D.: Iterative learning for reliable crowdsourcing systems. Adv. Neural Inf. Process. Syst. 24, 1953–1961 (2011)

  • Kaspar, A., Patterson, G., Kim, C., Aksoy, Y., Matusik, W., Elgharib, M.: Crowd-guided ensembles: How can we choreograph crowd workers for video segmentation? In: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pp. 1–12 (2018)

  • Kim, H.C., Ghahramani, Z.: Bayesian classifier combination. In: Artificial Intelligence and Statistics. PMLR, pp. 619–627 (2012)

  • Landman, B.A., Asman, A.J., Scoggins, A.G., Bogovic, J.A., Xing, F., Prince, J.L.: Robust statistical fusion of image labels. IEEE Trans. Med. Imaging 31(2), 512–522 (2011)

  • Lee, D., Das Sarma, A., Parameswaran, A.: Aggregating crowdsourced image segmentations. HCOMP (2018)

  • Li, Q., Li, Y., Gao, J., Su, L., Zhao, B., Demirbas, M., Fan, W., Han, J.: A confidence-aware approach for truth discovery on long-tail data. Proc. VLDB Endow. 8(4), 425–436 (2014a)

  • Li, Q., Li, Y., Gao, J., Zhao, B., Fan, W., Han, J.: Resolving conflicts in heterogeneous data by truth discovery and source reliability estimation. In: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, pp. 1187–1198 (2014b)

  • Lin, C.H., Mausam, M., Weld, D.S.: Crowdsourcing control: moving beyond multiple choice. In: Workshops at the Twenty-Sixth AAAI Conference on Artificial Intelligence (2012)

  • Little, J., Abrams, A., Pless, R.: Tools for richer crowd source image annotations. In: 2012 IEEE Workshop on the Applications of Computer Vision (WACV). IEEE, pp. 369–374 (2012)

  • Liu, B.: Sentiment analysis and opinion mining. Synth. Lect. Hum. Lang. Technol. 5(1), 1–167 (2012)

  • Liu, Q., Peng, J., Ihler, A.: Variational inference for crowdsourcing. Adv. Neural Inf. Process. Syst. 25 (2012a)

  • Liu, X., Lu, M., Ooi, B.C., Shen, Y., Wu, S., Zhang, M.: Cdas: a crowdsourcing data analytics system. arXiv:1207.0143 (2012b)

  • Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440 (2015)

  • Ma, F., Li, Y., Li, Q., Qiu, M., Gao, J., Zhi, S., Su, L., Zhao, B., Ji, H., Han, J.: Faitcrowd: fine grained truth discovery for crowdsourced data aggregation. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 745–754 (2015)

  • Natonek, E.: Fast range image segmentation for servicing robots. In: Proceedings. 1998 IEEE International Conference on Robotics and Automation (Cat. No. 98CH36146), vol. 1. IEEE, pp. 406–411 (1998)

  • Nguyen, A.T., Wallace, B.C., Li, J.J., Nenkova, A., Lease, M.: Aggregating and predicting sequence labels from crowd annotations. In: Proceedings of the Conference. Association for Computational Linguistics. Meeting, vol. 2017. NIH Public Access, p. 299 (2017)

  • Nye, B., Li, J.J., Patel, R., Yang, Y., Marshall, I.J., Nenkova, A., Wallace, B.C.: A corpus with multi-level annotations of patients, interventions and outcomes to support language processing for medical literature. In: Proceedings of the Conference. Association for Computational Linguistics. Meeting, vol. 2018. NIH Public Access, p. 197 (2018)

  • Paun, S., Chamberlain, J., Kruschwitz, U., Yu, J., Poesio, M.: A Probabilistic Annotation Model for Crowdsourcing Coreference (2020)

  • Raykar, V.C., Yu, S., Zhao, L.H., Valadez, G.H., Florin, C., Bogoni, L., Moy, L.: Learning from crowds. J. Mach. Learn. Res. 11(4), 1297–1322 (2010)

  • Rodrigues, F., Pereira, F., Ribeiro, B.: Sequence labeling with multiple annotators. Mach. Learn. 95(2), 165–181 (2014)

  • Russakovsky, O., Li, L.J., Fei-Fei, L.: Best of both worlds: human-machine collaboration for object annotation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2121–2131 (2015)

  • Sameki, M., Gurari, D., Betke, M.: Characterizing image segmentation behavior of the crowd. In: Collective Intelligence, pp. 1–4 (2015)

  • Snow, R., O’Connor, B., Jurafsky, D., Ng, A.Y.: Cheap and fast – but is it good? Evaluating non-expert annotations for natural language tasks (2008a)

  • Snow, R., O’Connor, B., Jurafsky, D., Ng, A.Y.: Cheap and fast – but is it good? Evaluating non-expert annotations for natural language tasks. In: Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pp. 254–263 (2008b)

  • Sofiiuk, K., Petrov, I., Barinova, O., Konushin, A.: f-brs: rethinking backpropagating refinement for interactive segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8623–8632 (2020)

  • Sorokin, A., Forsyth, D.: Utility data annotation with amazon mechanical turk. In: 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops. IEEE, pp. 1–8 (2008)

  • Torralba, A., Russell, B.C., Yuen, J.: Labelme: online image annotation and applications. Proc. IEEE 98(8), 1467–1484 (2010)

  • Venanzi, M., Guiver, J., Kazai, G., Kohli, P., Shokouhi, M.: Community-based Bayesian aggregation models for crowdsourcing. In: Proceedings of the 23rd International Conference on World Wide Web, pp. 155–164 (2014)

  • Vittayakorn, S., Hays, J.: Quality assessment for crowdsourced object annotations. In: BMVC, pp. 1–11 (2011)

  • Warfield, S.K., Zou, K.H., Wells, W.M.: Simultaneous truth and performance level estimation (STAPLE): an algorithm for the validation of image segmentation. IEEE Trans. Med. Imaging 23(7), 903–921 (2004)

  • Welinder, P., Branson, S., Perona, P., Belongie, S.: The multidimensional wisdom of crowds. Adv. Neural Inf. Process. Syst. 23, 2424–2432 (2010)

  • Whitehill, J., Wu, T.F., Bergsma, J., Movellan, J., Ruvolo, P.: Whose vote should count more: optimal integration of labels from labelers of unknown expertise. Adv. Neural Inf. Process. Syst. 22, 2035–2043 (2009)

  • Yamaguchi, K., Kiapour, M.H., Ortiz, L.E., Berg, T.L.: Parsing clothing in fashion photographs. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, pp. 3570–3577 (2012)

  • Zaidan, O., Callison-Burch, C.: Crowdsourcing translation: professional quality from non-professionals. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 1220–1229 (2011)

  • Zheng, Y., Li, G., Li, Y., Shan, C., Cheng, R.: Truth inference in crowdsourcing: is the problem solved? Proc. VLDB Endow. 10(5), 541–552 (2017)

  • Zhou, D., Platt, J.C., Basu, S., Mao, Y.: Learning from the wisdom of crowds by minimax entropy (2012)

Funding

This work was supported by the National Natural Science Foundation under Grant nos. 61932007 and 61972013.

Author information

Contributions

YY: conceptualization, methodology, data curation, writing – original draft preparation. PC: conceptualization, methodology, writing – review and editing. HS: conceptualization, methodology, writing – review and editing, supervision, funding acquisition.

Corresponding author

Correspondence to Hailong Sun.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

All the authors agree to the publication of this work.

About this article

Cite this article

Yang, Y., Chen, P. & Sun, H. Incorporating pixel proximity into answer aggregation for crowdsourced image segmentation. CCF Trans. Pervasive Comp. Interact. 4, 172–187 (2022). https://doi.org/10.1007/s42486-022-00090-w

  • DOI: https://doi.org/10.1007/s42486-022-00090-w
