Abstract
This paper proposes a novel regularization approach that biases Convolutional Neural Networks (CNNs) toward utilizing edge and line features in their hidden layers. Rather than learning arbitrary kernels, we constrain the convolutional layers to edge and line detection kernels. This intentional bias regularizes the models and improves generalization performance, especially on small datasets. As a result, test accuracies improve by margins of \(5-11\) percentage points across four challenging fine-grained classification datasets with limited training data, while the number of trainable parameters remains identical. Instead of traditional convolutional layers, we use Pre-defined Filter Modules, which convolve the input data with a fixed set of \(3 \times 3\) pre-defined edge and line filters. A subsequent ReLU erases information that did not trigger any positive response. Next, a \(1 \times 1\) convolutional layer generates linear combinations. Notably, the pre-defined filters are a fixed component of the architecture and remain unchanged during training. Our findings reveal that the number of dimensions spanned by the set of pre-defined filters has little impact on recognition performance. However, the size of the filter set matters, with nine or more filters providing the best results.
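For concreteness, the following is a minimal PyTorch sketch of such a Pre-defined Filter Module: a fixed bank of \(3 \times 3\) edge and line kernels, a ReLU, and a learnable \(1 \times 1\) convolution. The specific kernel values, the depthwise application, and the class name are illustrative assumptions, not the exact configuration used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class PredefinedFilterModule(nn.Module):
    """Sketch of a Pre-defined Filter Module (PFM): fixed 3x3 edge/line
    filters -> ReLU -> learnable 1x1 convolution.

    The concrete kernels and the depthwise arrangement are assumptions;
    only the overall structure follows the description in the paper.
    """

    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        # A few hand-crafted 3x3 edge and line kernels (assumed examples).
        kernels = torch.tensor([
            [[-1., 0., 1.], [-1., 0., 1.], [-1., 0., 1.]],    # vertical edge
            [[-1., -1., -1.], [0., 0., 0.], [1., 1., 1.]],    # horizontal edge
            [[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]],      # line / Laplacian
            [[-1., -1., 2.], [-1., 2., -1.], [2., -1., -1.]], # diagonal line
        ])
        n_filters = kernels.shape[0]
        # Fixed (non-trainable) filters, applied to every input channel.
        self.register_buffer(
            "weight", kernels.unsqueeze(1).repeat(in_channels, 1, 1, 1))
        self.in_channels = in_channels
        # Learnable 1x1 convolution forms linear combinations of the responses.
        self.pointwise = nn.Conv2d(in_channels * n_filters, out_channels,
                                   kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Depthwise convolution with the fixed filters (groups = in_channels).
        y = F.conv2d(x, self.weight, padding=1, groups=self.in_channels)
        y = F.relu(y)  # discard responses that did not trigger positively
        return self.pointwise(y)
```

Such a module could serve as a drop-in replacement for a \(3 \times 3\) convolutional layer, e.g. `PredefinedFilterModule(64, 128)`; only the \(1 \times 1\) convolution contributes trainable parameters. Note that only four example kernels are shown here, whereas the paper reports that nine or more pre-defined filters give the best results.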
References
Belkin, M., Hsu, D., Ma, S., Mandal, S.: Reconciling modern machine-learning practice and the classical bias-variance trade-off. Proc. Natl. Acad. Sci. 116(32), 15849–15854 (2019)
Gavrikov, P., Keuper, J.: CNN filter DB: an empirical investigation of trained convolutional filters. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 19044–19054. IEEE, New Orleans (2022)
Gavrikov, P., Keuper, J.: Rethinking 1x1 convolutions: can we train CNNs with frozen random filters? arXiv preprint arXiv:2301.11360 [cs] (2023)
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: 2015 IEEE International Conference on Computer Vision (ICCV), pp. 1026–1034. IEEE, Santiago (2015)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778. IEEE, Las Vegas (2016)
Hertel, L., Barth, E., Käster, T., Martinetz, T.: Deep convolutional neural networks as generic feature extractors. arXiv preprint arXiv:1710.02286 [cs] (2017)
Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)
Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
Krause, J., Stark, M., Deng, J., Fei-Fei, L.: 3D object representations for fine-grained categorization. In: 2013 IEEE International Conference on Computer Vision Workshops, pp. 554–561. IEEE, Sydney (2013)
Linse, C., Barth, E., Martinetz, T.: Convolutional neural networks do work with pre-defined filters. In: 2023 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2023)
Linse, C., Martinetz, T.: Large neural networks learning from scratch with very few data and without explicit regularization. In: Proceedings of the 2023 15th International Conference on Machine Learning and Computing (ICMLC 2023), pp. 279–283. Association for Computing Machinery, New York (2023)
Maji, S., Rahtu, E., Kannala, J., Blaschko, M., Vedaldi, A.: Fine-grained visual classification of aircraft. arXiv preprint arXiv:1306.5151 [cs] (2013)
Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML 2010), pp. 807–814. Omnipress, Madison (2010)
Nilsback, M.E., Zisserman, A.: Automated flower classification over a large number of classes. In: 2008 Sixth Indian Conference on Computer Vision, Graphics and Image Processing, pp. 722–729. IEEE, Bhubaneswar (2008)
Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32, pp. 8024–8035. Curran Associates, Inc. (2019)
Ramanujan, V., Wortsman, M., Kembhavi, A., Farhadi, A., Rastegari, M.: What’s hidden in a randomly weighted neural network? arXiv preprint arXiv:1911.13299 [cs] (2020)
Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vision 115(3), 211–252 (2015)
Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD Birds-200-2011 Dataset. Tech. Rep. CNS-TR-2011-001, California Institute of Technology (2011)
Wimmer, P., Mehnert, J., Condurache, A.: Interspace pruning: using adaptive filter representations to improve training of sparse CNNs. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12527–12537 (2022)
Acknowledgment
The work of Christoph Linse was supported by the Bundesministerium für Wirtschaft und Klimaschutz through the Mittelstand-Digital Zentrum Schleswig-Holstein Project.
Appendix
1.1 Training Hyperparameters for ImageNet
The networks are trained with a batch size of 48 on five NVIDIA GeForce RTX 4090 GPUs. The remaining training hyperparameters are taken from the PyTorch training reference [15]. The cross-entropy loss is minimized using stochastic gradient descent for 90 epochs with a momentum of 0.9 and a weight decay of 0.0001. The initial learning rate of 0.1 is reduced by a factor of 0.1 every 30 epochs.
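A minimal sketch of this schedule in PyTorch is given below; the model and the data loader are placeholders (a torchvision ResNet and random tensors), not the actual networks and ImageNet pipeline used in the paper.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision import models

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder model and data; replace with the actual network and ImageNet loader.
model = models.resnet18(weights=None).to(device)
train_loader = DataLoader(
    TensorDataset(torch.randn(96, 3, 224, 224), torch.randint(0, 1000, (96,))),
    batch_size=48, shuffle=True)

criterion = torch.nn.CrossEntropyLoss()
# SGD with momentum 0.9, weight decay 0.0001, initial learning rate 0.1.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)
# Reduce the learning rate by a factor of 0.1 every 30 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(90):
    model.train()
    for images, targets in train_loader:
        images, targets = images.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()
```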
1.2 Linear Independence of ReLU-Based Functions
Two functions \(f_1, f_2 : X \rightarrow Y\) are linearly independent if \(c_1 f_1(\textbf{x}) + c_2 f_2(\textbf{x}) = 0\) for all \(\textbf{x} \in X\) implies \(c_1 = c_2 = 0\).
Consider the functions from (7) that occur in the \(\text{PFM}\) module. These functions are linearly dependent if the pre-defined filters \(\tilde{w}_1\) and \(\tilde{w}_2\) are linearly dependent with \(a \tilde{w}_1 = \tilde{w}_2\), \(a \ge 0\).
Proof
Choose some arbitrary \(\textbf{x} \in \mathbb{R}^{M \times N}\). Choose \(c_1 \in \mathbb{R} \backslash \{0\}\) and \(c_2 = - c_1 / a\). Then, since the ReLU is positively homogeneous and therefore \(f_2(\textbf{x}) = a f_1(\textbf{x})\), we obtain \(c_1 f_1(\textbf{x}) + c_2 f_2(\textbf{x}) = c_1 f_1(\textbf{x}) - \frac{c_1}{a}\, a f_1(\textbf{x}) = 0\), i.e., a non-trivial linear combination vanishes everywhere.
\(\Box \)
The functions in (7) are linearly independent if \(\tilde{w}_1\) and \(\tilde{w}_2\) are linearly dependent with \(a \tilde{w}_1 = \tilde{w}_2\), \(a < 0\).
Proof
The \(\leftarrow \) direction is clear. To show the \(\rightarrow \) direction, let \(\textbf{x} \in \mathbb{R}^{M \times N}\) and consider the linear combination \(c_1 f_1(\textbf{x}) + c_2 f_2(\textbf{x})\). Since \(a < 0\), the responses \(\tilde{w}_1 * \textbf{x}\) and \(\tilde{w}_2 * \textbf{x} = a\,(\tilde{w}_1 * \textbf{x})\) have opposite signs, so after the ReLU at most one of \(f_1(\textbf{x})\) and \(f_2(\textbf{x})\) can be non-zero. The sum has to be zero for all \(\textbf{x} \in \mathbb{R}^{M \times N}\); choosing \(\textbf{x}\) with a positive response forces \(c_1 = 0\), and choosing \(\textbf{x}\) with a negative response forces \(c_2 = 0\). This means that both coefficients \(c_1\) and \(c_2\) have to be zero. \(\Box \)
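As an illustrative numerical check (not part of the original argument), the positive homogeneity of the ReLU and the disjoint supports for negative scaling factors can be verified directly on random responses:

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)
z = rng.standard_normal(1000)  # stand-in for the filter responses w1 * x

# a >= 0: ReLU(a*z) = a*ReLU(z), so the two functions are linearly dependent.
a = 2.5
print(np.allclose(relu(a * z), a * relu(z)))  # True

# a < 0: ReLU(z) and ReLU(a*z) are never non-zero at the same position,
# so only the trivial combination vanishes everywhere (linear independence).
a = -2.5
print(np.all(relu(z) * relu(a * z) == 0))  # True
```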
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Linse, C., Brückner, B., Martinetz, T. (2024). Enhancing Generalization in Convolutional Neural Networks Through Regularization with Edge and Line Features. In: Wand, M., Malinovská, K., Schmidhuber, J., Tetko, I.V. (eds) Artificial Neural Networks and Machine Learning – ICANN 2024. ICANN 2024. Lecture Notes in Computer Science, vol 15016. Springer, Cham. https://doi.org/10.1007/978-3-031-72332-2_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72331-5
Online ISBN: 978-3-031-72332-2