Abstract
Agricultural activities in India rely heavily on the monsoon rainfall during July–September every year. The India Meteorological Department has been issuing rainfall forecasts since 1886. Predictions at the country or broad-region level have limited benefit, since individual areas may see wide variations even when the overall average for India remains stable. This study explored the possibility of using clusters of districts as a more granular yet cohesive unit for rainfall forecasting, based on weather and atmospheric variables from the preceding 12 months. Principal Component Analysis (PCA) was used to reduce data dimensionality before an optimal cluster solution was created. Subsequently, a set of cluster-level linear regression models was found to perform better than a single regression model fitted to the entire sample. While district-level predictions showed limited value, the sequential combination of unsupervised and supervised techniques showed promising results at an overall level. These results will serve as a strong baseline for the planned extension of this pilot study, which will use advanced machine learning techniques to further improve prediction performance.
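The sequence described in the abstract (PCA, then clustering, then one linear regression per cluster compared against a single global regression) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the clustering algorithm, the number of components, or the number of clusters, so k-means, 3 components, and 2 clusters are assumptions made here for illustration, and the data are synthetic stand-ins for district-level predictors and rainfall.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Standardize the features and project them onto the top principal components."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    return Xs @ Vt[:n_components].T

def kmeans(Z, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (an assumed choice of clustering method)."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center unchanged
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

def fit_linear(X, y):
    """Ordinary least squares with an intercept column."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

# Synthetic data (hypothetical): 200 "districts", 12 predictors, two regimes
# whose rainfall responds differently to the same predictors.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 12))
X[100:] += 3.0
y = np.where(np.arange(200) < 100, X[:, 0] * 2.0, X[:, 0] * -1.0 + 5.0)
y = y + rng.normal(scale=0.5, size=200)

Z = pca_reduce(X, n_components=3)   # step 1: reduce dimensionality
labels = kmeans(Z, k=2)             # step 2: cluster districts

# Step 3a: one regression on the entire sample.
rmse_global = rmse(y, predict_linear(fit_linear(Z, y), Z))

# Step 3b: one regression per cluster, errors pooled over all districts.
y_hat = np.empty_like(y)
for j in np.unique(labels):
    mask = labels == j
    y_hat[mask] = predict_linear(fit_linear(Z[mask], y[mask]), Z[mask])
rmse_cluster = rmse(y, y_hat)

print(f"global RMSE: {rmse_global:.3f}, cluster-level RMSE: {rmse_cluster:.3f}")
```

Note that this comparison is in-sample, where per-cluster least-squares fits can never do worse than the global fit by construction. A fair comparison of the two approaches would evaluate both on held-out data.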
Notes
- 1. Defined as at least 10 days ahead as per UK Meteorological Office definition.
- 2. +1 was used to tackle the 0 values.
- 3.
- 4.
- 5. Long Period Average (LPA) is defined as the average actual monsoon rainfall for 1951–2000.
Acknowledgement
This work was undertaken as part of the Master of Technology in Enterprise Business Analytics program at the Institute of Systems Science, National University of Singapore, under the guidance of Dr. Rita Chakravarti. The authors would like to thank Dr. Chakravarti for her guidance and the two anonymous reviewers for their valuable suggestions on this paper.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Deb, S., Acebedo, C.M.L., Yu, J., Dhanapal, G., Periasamy, N. (2017). Analysis of District-Level Monsoon Rainfall Patterns in India: A Pilot Study. In: Phon-Amnuaisuk, S., Ang, SP., Lee, SY. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2017. Lecture Notes in Computer Science, vol 10607. Springer, Cham. https://doi.org/10.1007/978-3-319-69456-6_11
DOI: https://doi.org/10.1007/978-3-319-69456-6_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69455-9
Online ISBN: 978-3-319-69456-6
eBook Packages: Computer Science, Computer Science (R0)