Abstract
We propose a fast hybrid statistical and graph-based sample preselection method for speeding up CNN training process. To do so, we process each class separately: some candidates are first extracted based on their distances to the class mean. Then, we structure all the candidates in a graph representation and use it to extract the final set of preselected samples. The proposed method is evaluated and discussed based on an image classification task, on three data sets that contain up to several hundred thousands of images.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20, 273–297 (1995)
Garcia, S., Derrac, J., Cano, J., Herrera, F.: Prototype selection for nearest neighbor classification: taxonomy and empirical study. IEEE Trans. Pattern Anal. Mach. Intell. 34, 417–435 (2012)
Goto, M., Ishida, R., Uchida, S.: Preselection of support vector candidates by relative neighborhood graph for large-scale character recognition. In: ICDAR, pp. 306–310 (2015)
Jankowski, N., Grochowski, M.: Comparison of instances seletion algorithms I. Algorithms survey. In: Rutkowski, L., Siekmann, J.H., Tadeusiewicz, R., Zadeh, L.A. (eds.) ICAISC 2004. LNCS (LNAI), vol. 3070, pp. 598–603. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-540-24844-6_90
Jung, H.G., Kim, G.: Support vector number reduction: survey and experimental evaluations. IEEE Trans. ITS 15, 463–476 (2014)
Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. Technical report, Computer Science Department, University of Toronto (2012)
Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998)
Lee, Y.J., Huang, S.Y.: Reduced support vector machines: a statistical theory. IEEE Trans. Neural Netw. 18, 1–13 (2007)
Rayar, F., Goto, M., Uchida, S.: CNN training with graph-based sample preselection: application to handwritten character recognition. CoRR abs/1712.02122 (2017)
Razafindramanana, O., Rayar, F., Venturini, G.: Alpha*-approximated delaunay triangulation based descriptors for handwritten character recognition. In: ICDAR, pp. 440–444 (2013)
Torralba, A., Fergus, R., Freeman, W.T.: 80 million tiny images: a large data set for nonparametric object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 30, 1958–1970 (2008)
Toussaint, G.T., Bhattacharya, B.K., Poulsen, R.S.: The application of Voronoi diagrams to non-parametric decision rules. Comput. Sci. Stat. 97–108 (1985)
Toussaint, G.T.: Some unsolved problems on proximity graphs (1991)
Toussaint, G.T., Berzan, C.: Proximity-graph instance-based learning, support vector machines, and high dimensionality: an empirical comparison. In: Perner, P. (ed.) MLDM 2012. LNCS (LNAI), vol. 7376, pp. 222–236. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-31537-4_18
Tran, Q.A., Zhang, Q.L., Li, X.: Reduce the number of support vectors by using clustering techniques. In: ICMLC, pp. 1245–1248 (2003)
Uchida, S., Ide, S., Iwana, B.K., Zhu, A.: A further step to perfect accuracy by training CNN with larger data. In: ICFHR, pp. 405–410 (2016)
Acknowledgement
This research was partially supported by MEXT-Japan (Grant No. 17H06100).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Rayar, F., Uchida, S. (2018). On Fast Sample Preselection for Speeding up Convolutional Neural Network Training. In: Bai, X., Hancock, E., Ho, T., Wilson, R., Biggio, B., Robles-Kelly, A. (eds) Structural, Syntactic, and Statistical Pattern Recognition. S+SSPR 2018. Lecture Notes in Computer Science(), vol 11004. Springer, Cham. https://doi.org/10.1007/978-3-319-97785-0_7
Download citation
DOI: https://doi.org/10.1007/978-3-319-97785-0_7
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97784-3
Online ISBN: 978-3-319-97785-0
eBook Packages: Computer ScienceComputer Science (R0)