Enhancing Generalization in Convolutional Neural Networks Through Regularization with Edge and Line Features

  • Conference paper
  • First Online:
Artificial Neural Networks and Machine Learning – ICANN 2024 (ICANN 2024)

Abstract

This paper proposes a novel regularization approach that biases Convolutional Neural Networks (CNNs) toward edge and line features in their hidden layers. Rather than learning arbitrary kernels, we constrain the convolution layers to edge and line detection kernels. This intentional bias regularizes the models and improves generalization performance, especially on small datasets. As a result, test accuracies improve by \(5-11\) percentage points across four challenging fine-grained classification datasets with limited training data, at an identical number of trainable parameters. Instead of traditional convolutional layers, we use Pre-defined Filter Modules, which convolve input data with a fixed set of \(3 \times 3\) pre-defined edge and line filters. A subsequent ReLU discards all responses that are not positive. Next, a \(1 \times 1\) convolutional layer generates linear combinations. Notably, the pre-defined filters are a fixed component of the architecture and remain unchanged during training. Our findings reveal that the number of dimensions spanned by the set of pre-defined filters has little impact on recognition performance, whereas the size of the filter set matters: nine or more filters provide optimal results.
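A minimal PyTorch sketch of such a module may clarify the pipeline described above: a frozen depthwise convolution with a bank of \(3 \times 3\) filters, a ReLU, then a learnable \(1 \times 1\) convolution. Class and filter names are illustrative assumptions, not the authors' code, and the two edge kernels shown are merely examples of a filter bank.

```python
import torch
import torch.nn as nn

class PredefinedFilterModule(nn.Module):
    """Illustrative sketch: fixed 3x3 filter bank -> ReLU -> learnable 1x1 conv."""

    def __init__(self, in_channels: int, out_channels: int, filters: torch.Tensor):
        super().__init__()
        n = filters.shape[0]  # number of pre-defined 3x3 kernels
        # depthwise convolution: each input channel is filtered by every kernel
        self.fixed = nn.Conv2d(in_channels, in_channels * n, kernel_size=3,
                               padding=1, groups=in_channels, bias=False)
        # tile the filter bank across input channels and freeze the weights
        weight = filters.unsqueeze(1).repeat(in_channels, 1, 1, 1)
        self.fixed.weight = nn.Parameter(weight, requires_grad=False)
        self.relu = nn.ReLU(inplace=True)
        # the only trainable part: linear combinations via 1x1 convolution
        self.mix = nn.Conv2d(in_channels * n, out_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.mix(self.relu(self.fixed(x)))

# example filter bank: one horizontal and one vertical edge kernel
edge_filters = torch.tensor([[[-1., -1., -1.], [0., 0., 0.], [1., 1., 1.]],
                             [[-1., 0., 1.], [-1., 0., 1.], [-1., 0., 1.]]])
pfm = PredefinedFilterModule(in_channels=3, out_channels=16, filters=edge_filters)
y = pfm(torch.randn(1, 3, 32, 32))  # spatial size preserved by padding=1
```

Only the \(1 \times 1\) mixing layer contributes trainable parameters; the filter bank stays constant throughout training.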

References

  1. Belkin, M., Hsu, D., Ma, S., Mandal, S.: Reconciling modern machine-learning practice and the classical bias-variance trade-off. Proc. Natl. Acad. Sci. 116(32), 15849–15854 (2019)

  2. Gavrikov, P., Keuper, J.: CNN filter DB: an empirical investigation of trained convolutional filters. In: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 19044–19054. IEEE, New Orleans (2022)

  3. Gavrikov, P., Keuper, J.: Rethinking 1x1 convolutions: can we train CNNs with frozen random filters? (2023). arXiv preprint arXiv:2301.11360 [cs]

  4. He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. In: 2015 IEEE International Conference on Computer Vision (ICCV), pp. 1026–1034. IEEE, Santiago (2015)

  5. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778. IEEE, Las Vegas (2016)

  6. Hertel, L., Barth, E., Käster, T., Martinetz, T.: Deep convolutional neural networks as generic feature extractors. arXiv preprint arXiv:1710.02286 [cs] (2017)

  7. Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 (2017)

  8. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)

  9. Krause, J., Stark, M., Deng, J., Fei-Fei, L.: 3D object representations for fine-grained categorization. In: 2013 IEEE International Conference on Computer Vision Workshops, pp. 554–561. IEEE, Sydney (2013)

  10. Linse, C., Barth, E., Martinetz, T.: Convolutional neural networks do work with pre-defined filters. In: 2023 International Joint Conference on Neural Networks (IJCNN), pp. 1–8. IEEE (2023)

  11. Linse, C., Martinetz, T.: Large neural networks learning from scratch with very few data and without explicit regularization. In: Proceedings of the 2023 15th International Conference on Machine Learning and Computing (ICMLC 2023), pp. 279–283. Association for Computing Machinery, New York (2023)

  12. Maji, S., Rahtu, E., Kannala, J., Blaschko, M., Vedaldi, A.: Fine-grained visual classification of aircraft. arXiv preprint arXiv:1306.5151 [cs] (2013)

  13. Nair, V., Hinton, G.E.: Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML 2010), pp. 807–814. Omnipress, Madison (2010)

  14. Nilsback, M.E., Zisserman, A.: Automated flower classification over a large number of classes. In: 2008 Sixth Indian Conference on Computer Vision, Graphics and Image Processing, pp. 722–729. IEEE, Bhubaneswar (2008)

  15. Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. In: Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 32, pp. 8024–8035. Curran Associates, Inc. (2019)

  16. Ramanujan, V., Wortsman, M., Kembhavi, A., Farhadi, A., Rastegari, M.: What’s hidden in a randomly weighted neural network? arXiv preprint arXiv:1911.13299 [cs] (2020)

  17. Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. Int. J. Comput. Vision 115(3), 211–252 (2015)

  18. Wah, C., Branson, S., Welinder, P., Perona, P., Belongie, S.: The Caltech-UCSD Birds-200-2011 Dataset. Tech. Rep. CNS-TR-2011-001, California Institute of Technology (2011)

  19. Wimmer, P., Mehnert, J., Condurache, A.: Interspace pruning: using adaptive filter representations to improve training of sparse CNNs. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12527–12537 (2022)

Acknowledgment

The work of Christoph Linse was supported by the Bundesministerium für Wirtschaft und Klimaschutz through the Mittelstand-Digital Zentrum Schleswig-Holstein Project.

Author information

Correspondence to Christoph Linse.

Appendix

1.1 Training Hyperparameters for ImageNet

The networks are trained with a batch size of 48 on five NVIDIA GeForce RTX 4090 GPUs. The remaining training hyperparameters are taken from the training reference of PyTorch [15]. The cross-entropy loss is minimized using stochastic gradient descent for 90 epochs with a momentum of 0.9 and a weight decay of 0.0001. The initial learning rate of 0.1 is reduced by a factor of 0.1 every 30 epochs.
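The schedule above can be sketched with PyTorch's optimizer and scheduler APIs; the `nn.Linear` model below is a placeholder standing in for the actual network, not part of the paper's setup.

```python
import torch
from torch import nn, optim

model = nn.Linear(16, 10)          # placeholder for the actual CNN
criterion = nn.CrossEntropyLoss()  # the minimized cross-entropy loss

# SGD with momentum 0.9, weight decay 0.0001, initial learning rate 0.1
optimizer = optim.SGD(model.parameters(), lr=0.1,
                      momentum=0.9, weight_decay=1e-4)

# reduce the learning rate by a factor of 0.1 every 30 of the 90 epochs
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
```

With `StepLR`, the learning rate follows 0.1 for epochs 0–29, 0.01 for epochs 30–59, and 0.001 for epochs 60–89.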

1.2 Linear Independence of ReLU-Based Functions

Two functions \(f_1, f_2 : X \rightarrow Y\) are linearly independent if

$$\begin{aligned} (\forall \textbf{x} \in X: c_1 f_1(\textbf{x}) + c_2 f_2(\textbf{x}) = 0) \Leftrightarrow c_1 = c_2 = 0 . \end{aligned}$$
(9)

Consider the functions from (7) that occur in the \(\text {PFM}\) module. The functions are linearly dependent if the pre-defined filters \(\tilde{w}_1\) and \(\tilde{w}_2\) satisfy \(a \tilde{w}_1 = \tilde{w}_2\) for some \(a > 0\).

Proof

Let \(\textbf{x} \in \mathbb {R}^{M \times N}\) be arbitrary. Choose \(c_1 \in \mathbb {R} \backslash \{0\}\) and \(c_2 = - c_1 / a\). Using \(\text {ReLU}(a u) = a \, \text {ReLU}(u)\) for \(a > 0\), we obtain

$$\begin{aligned} {\begin{matrix} & c_1 f^{(\tilde{w}_1, m, n)}(\textbf{x}) + c_2 f^{(\tilde{w}_2, m, n)}(\textbf{x}) \\ & = c_1 \text {ReLU}(\tilde{w}_1 * \textbf{x})[m,n] - \frac{c_1}{a} \text {ReLU}(a \tilde{w}_1 * \textbf{x})[m,n] \\ & = c_1 \text {ReLU}(\tilde{w}_1 * \textbf{x})[m,n] - c_1 \frac{a}{a} \text {ReLU}(\tilde{w}_1 * \textbf{x})[m,n] = 0 . \\ \end{matrix}} \end{aligned}$$
(10)

      \(\Box \)

The functions in (7) are linearly independent if \(\tilde{w}_1\) and \(\tilde{w}_2\) satisfy \(a \tilde{w}_1 = \tilde{w}_2\) for some \(a < 0\).

Proof

The \(\leftarrow \) direction is clear. To show the \(\rightarrow \) direction, assume \(\tilde{w}_1 \ne 0\) and let \(\textbf{x} \in \mathbb {R}^{M \times N}\).

$$\begin{aligned} {\begin{matrix} & c_1 f^{(\tilde{w}_1, m, n)}(\textbf{x}) + c_2 f^{(\tilde{w}_2, m, n)}(\textbf{x}) = 0 \\ & \Leftrightarrow c_1 \text {ReLU}(\tilde{w}_1 * \textbf{x})[m,n] + c_2 \text {ReLU}(a \tilde{w}_1 * \textbf{x})[m,n] = 0 \\ & \Leftrightarrow c_1 \text {ReLU}(\tilde{w}_1 * \textbf{x})[m,n] - a c_2 \text {ReLU}(-\tilde{w}_1 * \textbf{x})[m,n] = 0 \\ & \text {Case 1}: (\tilde{w}_1 * \textbf{x})[m,n] > 0 \implies c_1 = 0\\ & \text {Case 2}: (\tilde{w}_1 * \textbf{x})[m,n] < 0 \implies c_2 = 0, \text { since } -a > 0\\ \end{matrix}} \end{aligned}$$
(11)

The sum has to be zero for all \(\textbf{x} \in \mathbb {R}^{M \times N}\). Since \(\tilde{w}_1 \ne 0\), both cases occur for suitable \(\textbf{x}\), so both coefficients \(c_1\) and \(c_2\) have to be zero.       \(\Box \)
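Both cases can be checked numerically. In the sketch below, the array `u` is a hypothetical stand-in for the filter responses \((\tilde{w}_1 * \textbf{x})[m,n]\) over many inputs \(\textbf{x}\); for \(a > 0\) a nontrivial vanishing combination exists, while for \(a < 0\) the two functions are far from linearly dependent on the sample.

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(1000)  # stand-in for (w1 * x)[m, n] over many inputs

def f(scale, u):
    # ReLU of the scaled filter response, cf. the functions in (7)
    return np.maximum(scale * u, 0.0)

# Case a > 0: c1 = 1, c2 = -1/a makes the combination vanish everywhere
a = 2.0
c1, c2 = 1.0, -1.0 / a
dependent = np.allclose(c1 * f(1.0, u) + c2 * f(a, u), 0.0)

# Case a < 0: the two functions have disjoint supports, so no nontrivial
# combination vanishes; a tiny smallest singular value would reveal one
a = -2.0
A = np.stack([f(1.0, u), f(a, u)], axis=1)
smin = np.linalg.svd(A, compute_uv=False)[-1]
```

Here `dependent` comes out `True` and `smin` is large, mirroring the two statements proved above.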

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Linse, C., Brückner, B., Martinetz, T. (2024). Enhancing Generalization in Convolutional Neural Networks Through Regularization with Edge and Line Features. In: Wand, M., Malinovská, K., Schmidhuber, J., Tetko, I.V. (eds) Artificial Neural Networks and Machine Learning – ICANN 2024. ICANN 2024. Lecture Notes in Computer Science, vol 15016. Springer, Cham. https://doi.org/10.1007/978-3-031-72332-2_28

  • DOI: https://doi.org/10.1007/978-3-031-72332-2_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72331-5

  • Online ISBN: 978-3-031-72332-2

  • eBook Packages: Computer Science, Computer Science (R0)
