
EGRNN++ and PNN++ : Parallel and Distributed Neural Networks for Big Data Regression and Classification

Original Research · Published in SN Computer Science

Abstract

Probabilistic Neural Network (PNN) and General Regression Neural Network (GRNN) are unique in that they are trained in a single pass and perform well on classification and regression problems, respectively; however, they cannot handle big data sets because their pattern layer must store all the training samples. Therefore, this paper proposes hybrid architectures for PNN and GRNN in which the pattern layer is simplified by storing only the cluster centers of the samples, making the networks amenable to big data analytics. Specifically, this paper proposes applying parallel, distributed clustering algorithms, namely K-Means|| and parallel Bisecting K-Means, before the pattern layer of GRNN and PNN. We also propose logistic and Cauchy activation functions in place of the Gaussian activation function for both networks. The effectiveness of the variants of EGRNN++ was tested on two data sets, taken from chemistry and from the Amazon Movie Review data (for customer review rating prediction), under a ten-fold cross-validation (10-FCV) setup. Similarly, the performance of the variants of PNN++ was measured on the HEPMASS, HIGGS, ccFraud and Amazon Movie Review (sentiment classification) data sets under the same 10-FCV setup. The proposed variants of EGRNN++ produced very low Mean Squared Error (MSE), while the proposed variants of PNN++ produced high AUC. We also conducted the Wilcoxon signed-rank test at the 1% level of significance. It is worth emphasizing that the primary objective of this article is to present a distributed and parallel version of the traditional GRNN and PNN capable of handling big data sets.
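
The abstract sketches the core computation: a pattern layer built from cluster centers instead of all training samples, evaluated with a Gaussian, logistic or Cauchy kernel. The minimal NumPy sketch below illustrates that idea only; it is not the authors' Spark implementation, and the exact kernel formulas, the per-center statistics (mean target for regression, class label for classification) and the smoothing parameter sigma are illustrative assumptions.

    # Illustrative NumPy sketch of a cluster-center pattern layer (not the authors' Spark code).
    # Assumptions: `centers` are cluster centers summarizing the training set; each center
    # carries either the mean target of its cluster (regression) or a class label
    # (classification); the logistic and Cauchy kernel forms below are common choices and
    # may differ from the formulas used in the paper.
    import numpy as np

    def kernel(dist, sigma, kind="gaussian"):
        # Activation of a pattern-layer node at distance `dist` from its center.
        if kind == "gaussian":
            return np.exp(-dist ** 2 / (2.0 * sigma ** 2))
        if kind == "cauchy":      # heavy-tailed alternative to the Gaussian
            return 1.0 / (1.0 + dist ** 2 / sigma ** 2)
        if kind == "logistic":    # logistic-density-shaped kernel (assumed form)
            z = dist / sigma
            return 1.0 / (np.exp(z) + 2.0 + np.exp(-z))
        raise ValueError("unknown kernel: %s" % kind)

    def grnn_predict(x, centers, center_targets, sigma, kind="gaussian"):
        # GRNN-style regression: kernel-weighted average of the per-cluster mean targets.
        d = np.linalg.norm(centers - x, axis=1)
        w = kernel(d, sigma, kind)
        return float(np.dot(w, center_targets) / (w.sum() + 1e-12))

    def pnn_predict(x, centers, center_labels, sigma, kind="gaussian"):
        # PNN-style classification: the class with the largest summed kernel activation wins.
        d = np.linalg.norm(centers - x, axis=1)
        w = kernel(d, sigma, kind)
        classes = np.unique(center_labels)
        scores = np.array([w[center_labels == c].sum() for c in classes])
        return classes[int(np.argmax(scores))]

In the distributed setting described in the paper, the centers themselves would come from a parallel clustering step run beforehand, for example Spark MLlib's KMeans (whose default initialization mode is k-means||) or BisectingKMeans fitted on the full training set.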



Author information

Corresponding author

Correspondence to Vadlamani Ravi.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Kamaruddin, S., Ravi, V. EGRNN++ and PNN++ : Parallel and Distributed Neural Networks for Big Data Regression and Classification. SN COMPUT. SCI. 2, 109 (2021). https://doi.org/10.1007/s42979-021-00504-z

