Structured Prediction for Object Detection in Deep Neural Networks

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2014 (ICANN 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8681)

Abstract

Deep convolutional neural networks are currently applied to computer vision tasks, especially object detection. Because the output space is high-dimensional, with four dimensions per object bounding box, classification techniques do not apply easily. We propose to adapt a structured loss function for neural network training that directly maximizes the overlap of the prediction with ground truth bounding boxes. We show how this structured loss can be implemented efficiently, and demonstrate bounding box prediction on two of the Pascal VOC 2007 classes.
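
To make the notion of "overlap" concrete, the following is a minimal sketch of intersection over union (IoU), the overlap measure used in Pascal VOC detection evaluation. The abstract does not state the exact form of the structured loss, so the `Box` type, the corner-coordinate parameterization of the four box dimensions, and the illustrative `overlap_loss` helper are assumptions made here, not the authors' formulation.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box given by its corners (an assumed layout)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(box: Box) -> float:
    """Area of a box; degenerate or inverted boxes count as zero."""
    width = max(0.0, box.x_max - box.x_min)
    height = max(0.0, box.y_max - box.y_min)
    return width * height


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, a value in [0, 1]."""
    intersection = Box(
        max(a.x_min, b.x_min),
        max(a.y_min, b.y_min),
        min(a.x_max, b.x_max),
        min(a.y_max, b.y_max),
    )
    inter_area = area(intersection)
    union_area = area(a) + area(b) - inter_area
    return inter_area / union_area if union_area > 0 else 0.0


def overlap_loss(prediction: Box, ground_truth: Box) -> float:
    """Hypothetical loss that shrinks as overlap grows (illustration only)."""
    return 1.0 - iou(prediction, ground_truth)


if __name__ == "__main__":
    predicted = Box(0.0, 0.0, 2.0, 2.0)
    truth = Box(1.0, 1.0, 3.0, 3.0)
    print(f"IoU: {iou(predicted, truth):.4f}")
    print(f"Loss: {overlap_loss(predicted, truth):.4f}")
```

A quantity of this kind depends jointly on all four box coordinates, which is one way to see why treating each coordinate as an independent classification target does not fit the task well.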

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Schulz, H., Behnke, S. (2014). Structured Prediction for Object Detection in Deep Neural Networks. In: Wermter, S., et al. Artificial Neural Networks and Machine Learning – ICANN 2014. ICANN 2014. Lecture Notes in Computer Science, vol 8681. Springer, Cham. https://doi.org/10.1007/978-3-319-11179-7_50

  • DOI: https://doi.org/10.1007/978-3-319-11179-7_50

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11178-0

  • Online ISBN: 978-3-319-11179-7

  • eBook Packages: Computer Science, Computer Science (R0)
