Gish: a novel activation function for image classification

Kaytan, Mustafa; Aydilek, İbrahim Berkan; Yeroğlu, Celaleddin

doi:10.1007/s00521-023-09035-5

Original Article
Published: 27 September 2023

Volume 35, pages 24259–24281, (2023)
Neural Computing and Applications

Abstract

In Convolutional Neural Networks (CNNs), the choice of activation function is critically important. The Rectified Linear Unit (ReLU) is widely used in many CNN models. Recent studies, however, suggest that some non-monotonic activation functions are gradually becoming the new standard for improving the performance of CNN models. Non-monotonic activation functions such as Swish, Mish, Logish and Smish have produced successful results in various deep learning models, yet only a few of them are widely used across studies. Inspired by these functions, this study proposes a new activation function named Gish, defined by \(y=x\cdot \ln(2-{e}^{{-e}^{x}})\), whose properties allow it to outperform other activation functions. The factor \(x\) contributes a strong regulating effect on negative outputs. The logarithm reduces the numerical range of the expression \((2-{e}^{{-e}^{x}})\). To evaluate the performance of Gish, various experiments were conducted on different network models and datasets. In these experiments, Gish achieved a success rate of 98.7% with the EfficientNetB4 model on the MNIST dataset, 86.5% with the EfficientNetB5 model on the CIFAR-10 dataset and 90.8% with the EfficientNetB6 model on the SVHN dataset. These results were higher than those obtained with Swish, Mish, Logish and Smish, and they confirm the effectiveness of Gish.
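The formula in the abstract can be evaluated directly. The following is a minimal NumPy sketch, not the authors' implementation: the function name `gish`, the clamp value `50.0`, and the `log1p`/`expm1` rewrite are assumptions introduced here for numerical robustness. The rewrite relies on the identity \(2-{e}^{-t} = 1 - ({e}^{-t}-1)\), so it computes the same quantity as the published expression.

```python
import numpy as np


def gish(x):
    """Gish activation: y = x * ln(2 - exp(-exp(x))).

    Illustrative sketch only. The clamp and the log1p/expm1 form are
    assumptions made here to avoid overflow and cancellation; they are
    not taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    # exp(x) overflows for large x; beyond ~50, exp(-exp(x)) is already 0.0
    # in double precision, so clamping the inner argument does not change y.
    t = np.exp(np.minimum(x, 50.0))
    # 2 - exp(-t) == 1 + (1 - exp(-t)) == 1 - expm1(-t)
    return x * np.log1p(-np.expm1(-t))


xs = np.linspace(-5.0, 5.0, 11)
naive = xs * np.log(2.0 - np.exp(-np.exp(xs)))  # formula exactly as published
stable = gish(xs)
```

For large positive inputs the output approaches \(x\cdot \ln 2\), and for large negative inputs it approaches zero from below, which is consistent with the non-monotonic behaviour described in the abstract.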

Data availability

Datasets derived from public resources and made available with the article.

Author information

Authors and Affiliations

Computer Programming Program, Department of Computer Technologies, Vocational School of Technical Sciences, Harran University, Şanlıurfa, Turkey
Mustafa Kaytan
Department of Computer Engineering, Faculty of Engineering, Harran University, Şanlıurfa, Turkey
İbrahim Berkan Aydilek
Department of Computer Engineering, Faculty of Engineering and Natural Science, İskenderun Technical University, İskenderun, Hatay, Turkey
Celaleddin Yeroğlu

Corresponding author

Correspondence to Mustafa Kaytan.

Ethics declarations

Conflict of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kaytan, M., Aydilek, İ.B. & Yeroğlu, C. Gish: a novel activation function for image classification. Neural Comput & Applic 35, 24259–24281 (2023). https://doi.org/10.1007/s00521-023-09035-5

Received: 22 September 2022
Accepted: 06 September 2023
Published: 27 September 2023
Issue Date: December 2023
DOI: https://doi.org/10.1007/s00521-023-09035-5
