Optimizing Combinations of Teaching Image Data for Detecting Objects in Images

  • Conference paper
Human Interface and the Management of Information. Interacting with Information (HCII 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12185)

Abstract

Large numbers of images can now be prepared as teaching image data when a system detects objects in images using deep learning. However, detection accuracy is often low because the system has not previously learned the backgrounds against which the objects appear. This paper proposes a method that optimizes the combination of images used as teaching image data by means of dynamic programming (DP). First, the system creates mask data, which serves as a reference against which the teaching image data are compared. The system then calculates an image feature distance and a color similarity between the mask data and each image. Next, using DP, the system calculates the optimal sum of feature distances at each specified similarity rate. Finally, by back-calculating through the DP process, the system determines for each image whether it is selected as part of a suitable combination of teaching image data. The proposed method is expected to be effective for detecting objects in images with more complicated backgrounds.
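The abstract does not spell out the DP recurrence, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each candidate image has a feature distance to the mask data (smaller is better) and an integer color similarity score, and that the goal is to choose a subset whose summed similarity reaches a target rate while minimizing the summed distance. The function name `select_images`, the `target` parameter, and the sample numbers are all hypothetical.

```python
import math


def select_images(distances, similarities, target):
    """Pick images minimizing summed distance with summed similarity >= target.

    Assumed formulation (not taken from the paper): a 0/1 knapsack-style DP
    over integer similarity totals, capped at `target`.
    """
    n = len(distances)
    inf = math.inf
    # best[i][s]: minimum summed distance using the first i images when the
    # summed similarity (capped at target) equals s.
    best = [[inf] * (target + 1) for _ in range(n + 1)]
    # parent[i][s]: (previous s, whether image i-1 was taken) for best[i][s].
    parent = [[None] * (target + 1) for _ in range(n + 1)]
    best[0][0] = 0.0

    for i in range(1, n + 1):
        d, w = distances[i - 1], similarities[i - 1]
        for s in range(target + 1):
            base = best[i - 1][s]
            if base == inf:
                continue
            # Option 1: skip image i-1.
            if base < best[i][s]:
                best[i][s] = base
                parent[i][s] = (s, False)
            # Option 2: take image i-1.
            t = min(target, s + w)
            if base + d < best[i][t]:
                best[i][t] = base + d
                parent[i][t] = (s, True)

    if best[n][target] == inf:
        return inf, []  # no subset reaches the target similarity

    # Back-calculation: walk the parent table to recover the selected images.
    chosen = []
    s = target
    for i in range(n, 0, -1):
        prev_s, taken = parent[i][s]
        if taken:
            chosen.append(i - 1)
        s = prev_s
    chosen.reverse()
    return best[n][target], chosen


# Hypothetical example values, for illustration only.
total, picked = select_images(
    distances=[4.0, 2.0, 2.5, 1.0],
    similarities=[50, 30, 40, 20],
    target=70,
)
print(total, picked)
```

Storing a parent table keeps the back-calculation step explicit and avoids comparing floating-point sums a second time; a different objective (for example, maximizing similarity under a distance budget) would need a different recurrence.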

Acknowledgement

This work was supported by JSPS KAKENHI Grant Number 17H01950.

Author information

Corresponding author

Correspondence to Chika Oshima.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Nakamura, K., Hamasaki, R., Oshima, C., Nakayama, K. (2020). Optimizing Combinations of Teaching Image Data for Detecting Objects in Images. In: Yamamoto, S., Mori, H. (eds) Human Interface and the Management of Information. Interacting with Information. HCII 2020. Lecture Notes in Computer Science, vol 12185. Springer, Cham. https://doi.org/10.1007/978-3-030-50017-7_37

  • DOI: https://doi.org/10.1007/978-3-030-50017-7_37

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-50016-0

  • Online ISBN: 978-3-030-50017-7

  • eBook Packages: Computer Science (R0)
