Abstract
Thunderstorms are natural disasters that affect people, animals, and the economy. Their detrimental effects can be mitigated by identifying their occurrence in advance. To this end, the present work uses soft computing techniques, namely K-Nearest Neighbour (KNN), Decision Tree (DT), Logistic Regression (LR), and Support Vector Machine (SVM) with various kernel functions, to classify the occurrence of thunderstorms over Ranchi, India. These techniques were trained and tested using two data sets: daily average and hourly meteorological datasets. The primary purpose of this study is to find which dataset-classifier combination is optimal for classifying thunderstorm occurrence in Ranchi. No classifier was found to adequately classify either the Day Average Dataset or the Modified Day Average Dataset. The Hourly Dataset, on the other hand, was found to be more balanced in terms of the number of thunderstorm occurrences than the Day Average and Modified Day Average datasets. The outcomes on these datasets were compared using the F-score for thunderstorm occurrence obtained with each classifier. The results reveal that SVM with a radial basis function kernel (SVM-RBF) applied to the Hourly Dataset is the best combination for thunderstorm day classification. SVM-RBF attains F-scores of 0.81 overall and 0.74 for the thunderstorm-occurrence class alone. Grid search and Bagging were then applied to SVM-RBF to improve its performance, producing a hybrid Grid-Bag-SVM-RBF classifier with 82.04% accuracy and F-scores of 0.83 overall and 0.78 for the thunderstorm-occurrence class alone.
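The abstract compares classifiers by F-score, reported both overall and for the thunderstorm class alone. The following minimal sketch illustrates why the per-class value matters on an imbalanced dataset. The sample labels are hypothetical and are not data from the study. Treating the "overall" F-score as the support-weighted average of the per-class F-scores is also an assumption, since the abstract does not define it.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 score for a single class, computed from paired label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 (assumed meaning of 'overall')."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(
        per_class_f1(y_true, y_pred, label) * count / total
        for label, count in support.items()
    )


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical imbalanced sample: 90 non-thunderstorm days (0)
# and 10 thunderstorm days (1).
y_true = [0] * 90 + [1] * 10

# A trivial classifier that never predicts a thunderstorm.
y_majority = [0] * 100

acc = accuracy(y_true, y_majority)
f1_thunderstorm = per_class_f1(y_true, y_majority, 1)
f1_overall = weighted_f1(y_true, y_majority)

print(f"accuracy={acc:.2f}")
print(f"thunderstorm F1={f1_thunderstorm:.2f}")
print(f"overall (weighted) F1={f1_overall:.2f}")
```

In this toy case the trivial classifier reaches 0.90 accuracy while its thunderstorm-class F-score is 0, which is why reporting the F-score for the thunderstorm class alone is more informative than accuracy when thunderstorm samples are rare.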





About this article
Cite this article
Bala, K., Paul, S., Mohanty, S.N. et al. Improved Prediction Analysis with Hybrid Models for Thunderstorm Classification over the Ranchi Region. New Gener. Comput. 42, 7–31 (2024). https://doi.org/10.1007/s00354-022-00174-2
DOI: https://doi.org/10.1007/s00354-022-00174-2