Optimizing Combinations of Teaching Image Data for Detecting Objects in Images

Nakamura, Keisuke; Hamasaki, Ryodai; Oshima, Chika; Nakayama, Koichi

doi:10.1007/978-3-030-50017-7_37

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12185)

Included in the following conference series:

International Conference on Human-Computer Interaction

Abstract

Large numbers of images can now be prepared as teaching image data when a system detects objects in images using deep learning. However, detection accuracy is often low because the system has not previously learned the backgrounds against which the objects appear. This paper proposes a method that optimizes the combination of images used as teaching image data by means of dynamic programming (DP). First, the system creates mask data, which serves as a reference against which the teaching image data are compared. The system calculates an image feature distance and a color similarity between the mask data and each image. Next, using DP, the system calculates the optimum sum of feature distances at each specified similarity rate. Finally, by back-calculating through the DP process, the system determines whether each image is selected, yielding a suitable combination of teaching image data. The proposed method is expected to be effective for detecting objects in images with more complicated backgrounds.
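The DP and back-calculation steps described in the abstract can be illustrated with a small knapsack-style sketch. This is not the authors' implementation: the abstract does not give the recurrence, so the sketch assumes that each image has a non-negative feature distance and an integer color similarity (for example, percentage points), and that the goal is the subset whose similarities sum exactly to a target while minimizing the total feature distance. The function name `select_images` and its parameters are hypothetical.

```python
from math import inf


def select_images(distances, similarities, target):
    """Pick a subset of images by 0/1 knapsack-style DP (illustrative only).

    Assumptions (not taken from the paper): similarities are non-negative
    integers, distances are non-negative numbers, and we minimize the total
    distance among subsets whose similarities sum to exactly `target`.
    Returns (minimum total distance, sorted list of chosen image indices).
    """
    n = len(distances)
    # dp[i][j]: minimum total distance over subsets of the first i images
    # whose similarities sum to exactly j (inf if no such subset exists).
    dp = [[inf] * (target + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        d, s = distances[i - 1], similarities[i - 1]
        for j in range(target + 1):
            dp[i][j] = dp[i - 1][j]  # option 1: skip image i-1
            if j >= s and dp[i - 1][j - s] + d < dp[i][j]:
                dp[i][j] = dp[i - 1][j - s] + d  # option 2: take image i-1
    if dp[n][target] == inf:
        return inf, []
    # Back-calculation: walk the table from the last row to recover
    # which images were taken on the optimal path.
    chosen = []
    j = target
    for i in range(n, 0, -1):
        if dp[i][j] != dp[i - 1][j]:  # value changed, so image i-1 was taken
            chosen.append(i - 1)
            j -= similarities[i - 1]
    chosen.reverse()
    return dp[n][target], chosen


if __name__ == "__main__":
    # Made-up toy values, for illustration only.
    best, picked = select_images([4.0, 2.0, 3.0, 1.0], [3, 2, 4, 1], target=5)
    print(best, picked)
```

In this toy run, images 2 and 3 are selected because their similarities sum to the target with the smallest combined distance. A different objective (for example, a similarity threshold rather than an exact sum) would change the recurrence but not the back-calculation idea.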

Acknowledgement

This work was supported by JSPS KAKENHI Grant Number 17H01950.

Author information

Authors and Affiliations

Saga University, Saga, 840-8502, Japan
Keisuke Nakamura, Ryodai Hamasaki, Chika Oshima & Koichi Nakayama

Corresponding author

Correspondence to Chika Oshima.

Editor information

Editors and Affiliations

Tokyo University of Science, Tokyo, Japan
Sakae Yamamoto
Tokyo City University, Tokyo, Japan
Hirohiko Mori

About this paper

Cite this paper

Nakamura, K., Hamasaki, R., Oshima, C., Nakayama, K. (2020). Optimizing Combinations of Teaching Image Data for Detecting Objects in Images. In: Yamamoto, S., Mori, H. (eds) Human Interface and the Management of Information. Interacting with Information. HCII 2020. Lecture Notes in Computer Science, vol 12185. Springer, Cham. https://doi.org/10.1007/978-3-030-50017-7_37

DOI: https://doi.org/10.1007/978-3-030-50017-7_37
Published: 10 July 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-50016-0
Online ISBN: 978-3-030-50017-7
eBook Packages: Computer Science, Computer Science (R0)
