Abstract
Training object detection models with less data is the focus of existing N-shot learning methods in computer vision. Such methods rely on object-level labels and take hours to train on unseen classes. In many cases, large numbers of image-level labels are available but cannot be exploited by few-shot object detection models for training. There is a need for a machine learning framework that can be trained on any unseen class quickly enough to be useful in real-time situations. In this paper, we propose an “Unseen Class Detector” that can be trained within a short time for any possible unseen class, without bounding boxes, at competitive accuracy. We build our approach on “Strong” and “Weak” baseline detectors, trained on object detection and image classification datasets, respectively. Unseen concepts are fine-tuned on the strong baseline detector using only image-level labels and are further adapted by transferring classifier-detector knowledge between the baselines. We use semantic as well as visual similarities to identify the source class (e.g., sheep) for the fine-tuning and adaptation of an unseen class (e.g., goat). Our model (UnseenNet) is trained on the ImageNet classification dataset for unseen classes and tested on an object detection dataset (OpenImages). UnseenNet improves mean average precision (mAP) by 10% to 30% over existing semi-supervised and few-shot object detection baselines. Moreover, training the proposed model takes \(<10\) min for each unseen class.
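The source-class selection described above can be sketched as a nearest-neighbour lookup in an embedding space: given a vector for the unseen class, pick the seen detector class whose vector is most similar. This is a minimal illustrative sketch only, with toy vectors and plain cosine similarity; the paper combines semantic (WordNet-based) and visual similarities, and the function and class names here are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def pick_source_class(unseen_vec, seen_classes):
    """Return the seen class whose embedding is closest to the unseen one."""
    return max(seen_classes, key=lambda name: cosine(unseen_vec, seen_classes[name]))

# Toy embeddings (illustrative only, not the paper's actual features).
seen = {
    "sheep": [0.9, 0.8, 0.1],
    "car":   [0.1, 0.2, 0.9],
    "dog":   [0.6, 0.4, 0.3],
}
goat = [0.85, 0.75, 0.15]
print(pick_source_class(goat, seen))  # "sheep" is the nearest source class
```

The selected source class ("sheep") would then seed fine-tuning and adaptation for the unseen class ("goat"), which is what lets the detector skip bounding-box annotation for new concepts.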
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Aslam, A., Curry, E. (2023). UnseenNet: Fast Training Detector for Unseen Concepts with No Bounding Boxes. In: Yan, W.Q., Nguyen, M., Stommel, M. (eds) Image and Vision Computing. IVCNZ 2022. Lecture Notes in Computer Science, vol 13836. Springer, Cham. https://doi.org/10.1007/978-3-031-25825-1_2
DOI: https://doi.org/10.1007/978-3-031-25825-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25824-4
Online ISBN: 978-3-031-25825-1
eBook Packages: Computer Science (R0)