Abstract
Humans intuitively understand the content of images and often agree that some images make visual search tasks more difficult than others. For computers, however, estimating this difficulty is challenging, because it is a subjective judgment that may be influenced by human emotional factors. Rather than focusing on how models respond to datasets, our method assigns each sample in a dataset a score that estimates the difficulty of its visual search task. Our model achieves better performance in predicting the visual search difficulty scores produced by human annotators for samples in PASCAL VOC2012. Finally, we demonstrate experimentally that our method can select suitable samples to improve the performance of detectors in a semi-supervised task.
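The two uses of the difficulty scores described above, comparing predicted scores with human annotations and picking samples for semi-supervised training, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `kendall_tau` and `select_samples`, the use of Kendall's tau as the agreement measure, and the easiest-first selection rule are all assumptions made for illustration, since the abstract does not state the metric or the selection criterion.

```python
from typing import Dict, List, Sequence


def kendall_tau(predicted: Sequence[float], human: Sequence[float]) -> float:
    """Kendall's tau-a between two score lists (assumed agreement measure).

    Counts pairs ranked in the same order (concordant) minus pairs ranked
    in opposite order (discordant), divided by the total number of pairs.
    """
    n = len(predicted)
    if n != len(human) or n < 2:
        raise ValueError("need two equally long lists with at least 2 items")
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (predicted[i] - predicted[j]) * (human[i] - human[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def select_samples(scores: Dict[str, float], budget: int,
                   easiest_first: bool = True) -> List[str]:
    """Return up to `budget` sample ids ranked by predicted difficulty.

    Hypothetical selection rule: whether easy or hard samples are the
    "suitable" ones is an assumption controlled by `easiest_first`.
    """
    if budget < 0:
        raise ValueError("budget must be non-negative")
    # Sort by score, then by id, so ties are resolved deterministically.
    ranked = sorted(scores.items(), key=lambda kv: (kv[1], kv[0]),
                    reverse=not easiest_first)
    return [sample_id for sample_id, _ in ranked[:budget]]


if __name__ == "__main__":
    # Made-up scores for three hypothetical unlabeled images.
    predicted = {"img_a": 0.9, "img_b": 0.1, "img_c": 0.5}
    print(select_samples(predicted, budget=2))
    print(kendall_tau([0.1, 0.5, 0.9], [0.2, 0.4, 0.7]))
```

In a semi-supervised pipeline, the selected ids would be the unlabeled images passed on to the detector for pseudo-labeling or annotation; the budget and the ranking direction would need to be tuned for the task at hand.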
Acknowledgements
This research was partly supported by the Open Fund of the Key Laboratory of Urban Land Resources Monitoring and Simulation, Ministry of Land and Resources (KF-2018-03-065) and the National Science Foundation, China (No. 61702226).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ma, D., Zhang, H., Wu, H., Zhang, T., Sun, J. (2019). Estimating Difficulty Score of Visual Search in Images for Semi-supervised Object Detection. In: Ohara, K., Bai, Q. (eds) Knowledge Management and Acquisition for Intelligent Systems. PKAW 2019. Lecture Notes in Computer Science, vol 11669. Springer, Cham. https://doi.org/10.1007/978-3-030-30639-7_1
DOI: https://doi.org/10.1007/978-3-030-30639-7_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30638-0
Online ISBN: 978-3-030-30639-7
eBook Packages: Computer Science, Computer Science (R0)