Abstract
Large numbers of images can now be prepared as teaching image data for systems that detect objects in images using deep learning. However, detection accuracy is often low because the system has not learned the backgrounds against which the objects appear. This paper proposes a method that uses dynamic programming (DP) to optimize the combination of images used as teaching image data. First, the system creates mask data, which serves as a reference against which the teaching image data are compared. The system then calculates an image feature distance and a color similarity between the mask data and each image. Next, using DP, it calculates the optimum sum of feature distances at each specified similarity rate. Finally, it back-calculates through the DP process to determine which images belong to a suitable combination of teaching image data. The proposed method is expected to be effective for detecting objects in images with more complicated backgrounds.
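The DP and back-calculation steps described in the abstract resemble a 0/1 knapsack problem, and the following is a minimal sketch under that assumption rather than the authors' implementation. It assumes each image has a feature distance (a float) and a color similarity rate expressed as a non-negative integer, and that the goal is to maximize the summed feature distance while keeping the summed similarity within a budget. The abstract does not state the direction of optimization or the constraint, so both are hypothetical, as are the names `select_images`, `distances`, `similarities`, and `budget`.

```python
from typing import List, Tuple


def select_images(
    distances: List[float], similarities: List[int], budget: int
) -> Tuple[float, List[int]]:
    """Hypothetical 0/1 knapsack over images.

    Assumption (not from the paper): maximize the summed feature distance
    subject to the summed integer similarity rate staying <= budget.
    Returns the best total distance and the indices of the selected images.
    """
    n = len(distances)
    # best[i][s]: best summed distance using only the first i images
    # with a total similarity of at most s.
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d, w = distances[i - 1], similarities[i - 1]
        for s in range(budget + 1):
            best[i][s] = best[i - 1][s]  # option 1: skip image i-1
            if w <= s and best[i - 1][s - w] + d > best[i][s]:
                best[i][s] = best[i - 1][s - w] + d  # option 2: take it

    # Back-calculation: walk the table from the last row to the first.
    # A changed value between rows means image i-1 was selected.
    chosen: List[int] = []
    s = budget
    for i in range(n, 0, -1):
        if best[i][s] != best[i - 1][s]:
            chosen.append(i - 1)
            s -= similarities[i - 1]
    chosen.reverse()
    return best[n][budget], chosen
```

Calling `select_images([6.0, 10.0, 12.0], [1, 2, 3], 5)` fills the table for three images and then recovers the selected indices by back-calculation. If the intended objective is instead to minimize distance, the comparison and the table initialization would need to change accordingly.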
Acknowledgement
This work was supported by JSPS KAKENHI Grant Number 17H01950.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Nakamura, K., Hamasaki, R., Oshima, C., Nakayama, K. (2020). Optimizing Combinations of Teaching Image Data for Detecting Objects in Images. In: Yamamoto, S., Mori, H. (eds) Human Interface and the Management of Information. Interacting with Information. HCII 2020. Lecture Notes in Computer Science, vol 12185. Springer, Cham. https://doi.org/10.1007/978-3-030-50017-7_37
DOI: https://doi.org/10.1007/978-3-030-50017-7_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-50016-0
Online ISBN: 978-3-030-50017-7
eBook Packages: Computer Science, Computer Science (R0)