Abstract
“Catastrophic forgetting” and task scalability are two major challenges of incremental learning. Both issues stem from the limited capacity of the machine learning model and from weights that become insufficiently trained as the number of tasks grows. In this paper, we investigate how the neural network architecture affects the performance of incremental learning in the case of image classification. As tasks accumulate, we propose to use neural architecture search (NAS) to find a structure that better fits the enlarged task collection. We build a NAS environment that uses reinforcement learning as the search strategy and a Long Short-Term Memory (LSTM) network as the controller. In the search phase, a computation operation and the set of previous nodes to connect are selected for each layer. Each time a new group of tasks is added, the network architecture is searched and reorganized according to the training data set. To speed up the search, we design a parameter-sharing mechanism in which identical building blocks in each layer share one group of parameters. We also introduce quantified-parameter building blocks into the NAS to identify the best candidate in each round of searching. On the CIFAR-100 data set, our solution outperforms representative methods (LwEMC, iCaRL, and GANIL) in average accuracy by 24.92%, 5.62%, and 3.6%, respectively, and its advantage grows as more tasks are added.
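To make the search loop described above concrete, the following is a minimal sketch in PyTorch, assuming an ENAS-style setup (cf. Pham et al. in the references): an LSTM controller samples, for each layer, one computation operation and a set of connections to previous nodes, and is updated with REINFORCE using the sampled child network's validation accuracy as the reward. The class names, operation list, and hyperparameters below are illustrative placeholders, not the configuration used in this paper.

import torch
import torch.nn as nn

# Candidate operations per layer (illustrative; the paper's actual set may differ).
OPS = ["conv3x3", "conv5x5", "sep_conv3x3", "sep_conv5x5", "avg_pool3x3", "max_pool3x3"]

class Controller(nn.Module):
    """LSTM controller: per layer, pick one operation and a set of skip inputs."""
    def __init__(self, num_layers, hidden=64):
        super().__init__()
        self.num_layers = num_layers
        self.lstm = nn.LSTMCell(hidden, hidden)
        self.op_head = nn.Linear(hidden, len(OPS))  # logits over operations
        self.skip_head = nn.Linear(hidden, 1)       # score for a candidate skip connection
        self.start = nn.Parameter(torch.zeros(1, hidden))

    def sample(self):
        h = torch.zeros(1, self.lstm.hidden_size)
        c = torch.zeros(1, self.lstm.hidden_size)
        x, arch = self.start, []
        logp = torch.zeros(())                      # running sum of log-probabilities
        for layer in range(self.num_layers):
            h, c = self.lstm(x, (h, c))
            # Sample the computation operation for this layer.
            op_dist = torch.distributions.Categorical(logits=self.op_head(h))
            op = op_dist.sample()
            logp = logp + op_dist.log_prob(op).sum()
            # Sample one Bernoulli decision per earlier node (simplified: the
            # same hidden state scores every candidate within a layer).
            skips = []
            for _prev in range(layer):
                p = torch.sigmoid(self.skip_head(h)).reshape(())
                skip_dist = torch.distributions.Bernoulli(probs=p)
                s = skip_dist.sample()
                logp = logp + skip_dist.log_prob(s)
                skips.append(int(s.item()))
            arch.append((OPS[int(op.item())], skips))
            x = h                                   # feed the state forward
        return arch, logp

# One REINFORCE step; in practice the reward comes from training the sampled
# child network with shared weights and measuring its validation accuracy.
controller = Controller(num_layers=6)
optimizer = torch.optim.Adam(controller.parameters(), lr=3e-4)
arch, logp = controller.sample()
reward = 0.62                                       # placeholder validation accuracy
loss = -reward * logp
optimizer.zero_grad()
loss.backward()
optimizer.step()

Under the parameter-sharing scheme the abstract describes, the child network evaluated in each round would reuse the stored weights for its building blocks rather than training from scratch, which is what makes per-round search affordable.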
References
Baker, B., Gupta, O., Naik, N., Raskar, R.: Designing neural network architectures using reinforcement learning. In: International Conference on Learning Representations (2017)
Cai, H., Chen, T., Zhang, W., Yu, Y., Wang, J.: Reinforcement learning for architecture search by network transformation. arXiv preprint arXiv:1707.04873 (2017)
Cai, H., Yang, J., Zhang, W., Han, S., Yu, Y.: Path-level network transformation for efficient architecture search. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 678–687. PMLR, Stockholmsmässan, Stockholm, Sweden, 10–15 July 2018. http://proceedings.mlr.press/v80/cai18a.html
Castro, F.M., Marín-Jiménez, M.J., Guil, N., Schmid, C., Alahari, K.: End-to-end incremental learning. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11216, pp. 241–257. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01258-8_15
Cauwenberghs, G., Poggio, T.: Incremental and decremental support vector machine learning. In: Advances in Neural Information Processing Systems, pp. 409–415 (2001)
Chollet, F.: Xception: deep learning with depthwise separable convolutions. arXiv preprint arXiv:1610.02357 (2016)
Dong, J.-D., Cheng, A.-C., Juan, D.-C., Wei, W., Sun, M.: DPP-Net: device-aware progressive search for pareto-optimal neural architectures. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11215, pp. 540–555. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01252-6_32
Gaier, A., Ha, D.: Weight agnostic neural networks. arXiv preprint arXiv:1906.04358 (2019)
Hsu, C.H., et al.: MONAS: multi-objective neural architecture search using reinforcement learning. arXiv preprint arXiv:1806.10332 (2018)
Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017)
Krizhevsky, A.: Learning multiple layers of features from tiny images. Technical report, University of Toronto (2009)
Li, X., Zhou, Y., Pan, Z., Feng, J.: Partial order pruning: for best speed/accuracy trade-off in neural architecture search. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019)
Li, Z., Hoiem, D.: Learning without forgetting. IEEE Trans. Pattern Anal. Mach. Intell. 40(12), 2935–2947 (2017)
McCloskey, M., Cohen, N.J.: Catastrophic interference in connectionist networks: the sequential learning problem. Psychol. Learn. Motiv. 24, 109–165 (1989)
Mensink, T., Verbeek, J., Perronnin, F., Csurka, G.: Distance-based image classification: generalizing to new classes at near-zero cost. IEEE Trans. Pattern Anal. Mach. Intell. 35(11), 2624–2637 (2013)
Pham, H., Guan, M.Y., Zoph, B., Le, Q.V., Dean, J.: Efficient neural architecture search via parameter sharing. arXiv preprint arXiv:1802.03268 (2018)
Polikar, R., Udpa, L., Udpa, S., Honavar, V.: An incremental learning algorithm for supervised neural networks. IEEE Trans. Syst. Man Cybern. Part C, Special Issue on Knowledge Management (2000)
Real, E., et al.: Large-scale evolution of image classifiers. In: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 2902–2911. JMLR.org (2017)
Rebuffi, S.A., Kolesnikov, A., Sperl, G., Lampert, C.H.: iCaRL: incremental classifier and representation learning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2001–2010 (2017)
Riemer, M., et al.: Learning to learn without forgetting by maximizing transfer and minimizing interference. arXiv preprint arXiv:1810.11910 (2018)
Shmelkov, K., Schmid, C., Alahari, K.: Incremental learning of object detectors without catastrophic forgetting. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3400–3409 (2017)
Tan, M., Chen, B., Pang, R., Vasudevan, V., Sandler, M., Howard, A., Le, Q.V.: MnasNet: platform-aware neural architecture search for mobile. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2820–2828 (2019)
Wu, Y., et al.: Incremental classifier learning with generative adversarial networks. arXiv preprint arXiv:1802.00853 (2018)
Xu, J., Zhu, Z.: Reinforced continual learning. In: Advances in Neural Information Processing Systems, pp. 899–908 (2018)
Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2018)
Acknowledgement
This work is supported by the National Key Research and Development Program of China under grant 2018YFB0203901 and by the NSF of China under grant 61732002.