Boosting the Performance of Object Detection CNNs with Context-Based Anomaly Detection

Blaha, Jan; Broughton, George; Krajník, Tomáš

doi:10.1007/978-3-030-67537-0_11

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 349)

Included in the following conference series:

International Conference on Collaborative Computing: Networking, Applications and Worksharing

Abstract

In this paper, we employ anomaly detection methods to improve object detectors by using the context of their detections. This has numerous potential applications, from boosting the performance of standard object detectors, to the preliminary validation of annotation quality, to robotic exploration and object search. We build our method on autoencoder networks for anomaly detection. Rather than filtering incoming data by an anomaly score, as is usual, we focus on the individual features of the data representing an actual scene. We show that autoencoders can learn the contextual relationships between objects in images, i.e. the likelihood of co-detecting classes in the same scene. This can then be used to identify detections that do and do not fit with the rest of the current observations in the scene. We show that using this information yields better results than traditional thresholding when deciding whether weaker detections should be counted as observed. Our experiments show not only that the method significantly improves the performance of CNN object detectors, but also that it can serve as an efficient tool for discovering incorrectly annotated images.
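The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label set, the two toy scene types, the network size, the training schedule, and the rule of hiding a class before reconstructing it are all assumptions made for illustration. A small denoising autoencoder is trained on class-presence vectors, and the reconstructed value of a hidden class is read as a rough measure of how well that class fits the rest of the scene.

```python
import numpy as np

# Hypothetical label set; the real detector's classes are not given in the abstract.
CLASSES = ["car", "traffic light", "fork", "knife", "plate"]


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def make_training_set(prototypes):
    """Pair each clean scene, and each copy with one present class removed,
    with the clean scene as the reconstruction target (denoising setup)."""
    inputs, targets = [], []
    for proto in prototypes:
        inputs.append(proto)
        targets.append(proto)
        for i in np.flatnonzero(proto):
            corrupted = proto.copy()
            corrupted[i] = 0.0
            inputs.append(corrupted)
            targets.append(proto)
    return np.array(inputs), np.array(targets)


def train_autoencoder(inputs, targets, hidden=3, steps=5000, lr=0.5, seed=0):
    """Fit a one-hidden-layer autoencoder with binary cross-entropy loss
    using plain full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n_classes = inputs.shape[1]
    w1 = rng.normal(0.0, 0.5, (n_classes, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, (hidden, n_classes))
    b2 = np.zeros(n_classes)
    for _ in range(steps):
        h = sigmoid(inputs @ w1 + b1)
        out = sigmoid(h @ w2 + b2)
        # Gradient of the loss w.r.t. the output pre-activations
        # (sigmoid output combined with cross-entropy).
        d_out = (out - targets) / len(inputs)
        d_h = (d_out @ w2.T) * h * (1.0 - h)
        w2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        w1 -= lr * (inputs.T @ d_h)
        b1 -= lr * d_h.sum(axis=0)
    return w1, b1, w2, b2


def reconstruct(params, x):
    w1, b1, w2, b2 = params
    return sigmoid(sigmoid(x @ w1 + b1) @ w2 + b2)


def context_score(params, presence, class_name):
    """Reconstructed value of one class when that class is hidden from the
    input: how strongly the remaining detections suggest it."""
    idx = CLASSES.index(class_name)
    masked = np.array(presence, dtype=float)
    masked[idx] = 0.0
    return float(reconstruct(params, masked)[idx])


# Two toy scene types (assumed): a street and a kitchen table.
street = np.array([1, 1, 0, 0, 0], dtype=float)
kitchen = np.array([0, 0, 1, 1, 1], dtype=float)
X, Y = make_training_set([street, kitchen])
params = train_autoencoder(X, Y)

# A weak "knife" detection next to a fork and a plate, and next to a car
# and a traffic light.
kitchen_score = context_score(params, [0, 0, 1, 1, 1], "knife")
street_score = context_score(params, [1, 1, 0, 1, 0], "knife")
print(f"knife in kitchen context: {kitchen_score:.3f}")
print(f"knife in street context:  {street_score:.3f}")
```

In this toy setting, a weak detection could be kept or discarded based on its context score instead of a fixed confidence threshold. How the score is combined with the detector's own confidence is left open here, since the abstract does not specify it.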

Acknowledgement

The authors acknowledge the support of the Czech Science Foundation project “Towards long-term autonomy through introduction of the temporal domain into spatial representations used in robotics” 20-27034J. The calculations were performed using computational resources provided by the OP VVV MEYS funded project CZ.02.1.01/0.0/0.0/16_019/0000765 “Research Center for Informatics”.

Author information

Authors and Affiliations

Artificial Intelligence Center, FEE CTU, Prague, Czech Republic
Jan Blaha, George Broughton & Tomáš Krajník

Authors

Jan Blaha
George Broughton
Tomáš Krajník

Corresponding author

Correspondence to Jan Blaha.

Editor information

Editors and Affiliations

Shanghai University, Shanghai, China
Honghao Gao
Xi’an Jiaotong-Liverpool University, Suzhou, China
Xinheng Wang
London South Bank University, London, UK
Muddesar Iqbal
Hangzhou Dianzi University, Hangzhou, China
Yuyu Yin
Zhejiang University, Hangzhou, China
Jianwei Yin
Fudan University, Shanghai, China
Ning Gu

About this paper

Cite this paper

Blaha, J., Broughton, G., Krajník, T. (2021). Boosting the Performance of Object Detection CNNs with Context-Based Anomaly Detection. In: Gao, H., Wang, X., Iqbal, M., Yin, Y., Yin, J., Gu, N. (eds) Collaborative Computing: Networking, Applications and Worksharing. CollaborateCom 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 349. Springer, Cham. https://doi.org/10.1007/978-3-030-67537-0_11

DOI: https://doi.org/10.1007/978-3-030-67537-0_11
Published: 22 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-67536-3
Online ISBN: 978-3-030-67537-0
eBook Packages: Computer Science, Computer Science (R0)