Efficient object detection using convolutional neural network-based hierarchical feature modeling

Lee, Byungjae; Erdenee, Enkhbayar; Jin, Songguo; Rhee, Phill Kyu

doi:10.1007/s11760-016-0962-x

Original Paper
Published: 01 September 2016

Volume 10, pages 1503–1510, (2016)

Signal, Image and Video Processing

Abstract

This paper presents a hierarchical, data-driven object detection framework that exploits the deep feature hierarchy of object appearances. The performance of many object detectors degrades because of ambiguities in inter-class appearance and variations in intra-class appearance, but deep features extracted from visual objects show a strong hierarchical clustering property. To discover this deep feature-driven information, deep features are partitioned into unsupervised super-categories at the inter-class level and into augmented categories at the object level. A hierarchical feature model is built using a latent topic model algorithm, and a one-versus-all support vector machine is assembled at each node to form a hierarchical classification ensemble. Extensive experiments on the PASCAL VOC 2007 and VOC 2012 datasets show that the proposed method outperforms state-of-the-art techniques.
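
The abstract does not give implementation details, so the following is only a rough sketch of the general two-level idea: classes are first grouped into super-categories, and a sample is then routed to a super-category node before a class is chosen inside it. Everything here is an assumption made for illustration. The 2-D toy features, the names `TRAIN`, `build_hierarchy`, and `predict`, and the distance threshold are hypothetical. A greedy threshold grouping stands in for the latent topic model, and nearest-centroid scoring stands in for the one-versus-all support vector machines; neither is the authors' method.

```python
from math import dist

# Hypothetical 2-D stand-ins for deep features; real CNN features are high-dimensional.
TRAIN = {
    "cat": [(1.0, 1.0), (1.2, 0.8)],
    "dog": [(1.6, 1.4), (1.8, 1.2)],
    "car": [(8.0, 8.0), (8.2, 7.8)],
    "bus": [(8.8, 8.6), (9.0, 8.4)],
}


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def group_classes(centroids, threshold):
    """Greedy one-pass grouping of class centroids into super-categories.

    Assumption: a simple substitute for unsupervised partitioning,
    not the latent topic model named in the abstract.
    """
    groups = []
    for name, c in centroids.items():
        for members in groups:
            if any(dist(c, centroids[m]) <= threshold for m in members):
                members.append(name)
                break
        else:
            groups.append([name])
    return groups


def build_hierarchy(train, threshold=3.0):
    """Return one node per super-category, each holding its child class centroids."""
    class_centroids = {name: centroid(pts) for name, pts in train.items()}
    nodes = []
    for members in group_classes(class_centroids, threshold):
        pooled = [p for m in members for p in train[m]]
        nodes.append({
            "centroid": centroid(pooled),
            "children": {m: class_centroids[m] for m in members},
        })
    return nodes


def predict(nodes, x):
    """Route x to the nearest super-category, then to the nearest class inside it.

    Assumption: nearest-centroid scoring replaces the per-node SVMs.
    """
    node = min(nodes, key=lambda n: dist(x, n["centroid"]))
    return min(node["children"], key=lambda m: dist(x, node["children"][m]))


if __name__ == "__main__":
    hierarchy = build_hierarchy(TRAIN)
    print([sorted(n["children"]) for n in hierarchy])
    print(predict(hierarchy, (1.5, 1.3)))
```

In this toy setting the coarse decision only has to separate animal-like from vehicle-like features, and the fine decision only has to separate classes that are already close together, which is the intuition behind resolving inter-class ambiguity at separate levels of a hierarchy.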

Acknowledgments

This work was supported by an Inha University research grant. A GPU used in this research was generously donated by NVIDIA Corporation.

Author information

Authors and Affiliations

Inha University, 235 Yong-Hyun Dong, Nam Ku, Incheon, South Korea
Byungjae Lee, Enkhbayar Erdenee, Songguo Jin & Phill Kyu Rhee

Corresponding author

Correspondence to Phill Kyu Rhee.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (docx 4923 KB)

About this article

Cite this article

Lee, B., Erdenee, E., Jin, S. et al. Efficient object detection using convolutional neural network-based hierarchical feature modeling. SIViP 10, 1503–1510 (2016). https://doi.org/10.1007/s11760-016-0962-x

Received: 26 March 2016
Revised: 14 July 2016
Accepted: 05 August 2016
Published: 01 September 2016
Issue Date: November 2016
DOI: https://doi.org/10.1007/s11760-016-0962-x
