
Gish: a novel activation function for image classification

  • Original Article
  • Neural Computing and Applications

Abstract

In Convolutional Neural Networks (CNNs), the selection and use of appropriate activation functions is of critical importance. The Rectified Linear Unit (ReLU) is widely used in many CNN models, but recent studies show that non-monotonic activation functions such as Swish, Mish, Logish and Smish yield successful results in various deep learning models and are gradually becoming a new standard for improving CNN performance, even though only a few of them have seen wide adoption. Inspired by these functions, this study proposes a new activation function named Gish, defined as \(y = x \cdot \ln\left(2 - e^{-e^{x}}\right)\), whose favorable properties allow it to outperform the alternatives above. The multiplicative variable \(x\) contributes a strong regularizing effect on negative outputs, and the logarithm reduces the numerical range of the expression \(2 - e^{-e^{x}}\). To evaluate the performance of Gish, experiments were conducted on different network models and datasets: Gish achieved 98.7% accuracy with the EfficientNetB4 model on MNIST, 86.5% with EfficientNetB5 on CIFAR-10, and 90.8% with EfficientNetB6 on SVHN, in each case exceeding the results obtained with Swish, Mish, Logish and Smish. These results confirm the effectiveness and performance of Gish.
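For readers who want to inspect the function directly, the following is a minimal NumPy sketch transcribed from the formula in the abstract. It is an illustration of the definition, not the authors' reference implementation, and the sample inputs are chosen only to show the behavior at both tails.

    import numpy as np

    def gish(x):
        # Gish as defined above: y = x * ln(2 - e^(-e^x)).
        # As x -> -inf, e^(-e^x) -> 1, so the log factor -> 0 and the
        # output saturates toward 0; as x -> +inf, the factor tends to
        # ln(2) ~ 0.693, so Gish grows almost linearly with slope ln(2).
        # (For extremely large x, np.exp(x) overflows with a warning,
        # though the limiting value x * ln(2) is still returned.)
        return x * np.log(2.0 - np.exp(-np.exp(x)))

    x = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
    print(gish(x))
    # approx: [-0.072, -0.268, 0.0, 0.660, 2.773]

Note that, like Swish and Mish, the curve dips slightly below zero for negative inputs before saturating, which is the non-monotonic behavior the abstract refers to.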


Data availability

The datasets were derived from public resources and are made available with the article.


Author information

Corresponding author

Correspondence to Mustafa Kaytan.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Kaytan, M., Aydilek, İ.B. & Yeroğlu, C. Gish: a novel activation function for image classification. Neural Comput & Applic 35, 24259–24281 (2023). https://doi.org/10.1007/s00521-023-09035-5

