Abstract
Hyperspectral imaging is a spectroscopic imaging technique that covers a broad range of electromagnetic wavelengths and subdivides it into spectral bands. As a consequence, it may distinguish specific features more effectively than conventional colour cameras. The technology is increasingly used in agriculture for applications such as crop leaf area index estimation, plant classification and disease monitoring. However, the abundance of information in hyperspectral imagery may cause a high-dimensionality problem, leading to computational complexity and storage issues. Data availability is another major issue. In agricultural applications it is typically difficult to collect an equal number of samples per class, because some classes or diseases are rare while others are abundant and easy to collect. This may give rise to an imbalanced data problem that can severely reduce machine learning performance and introduce bias in performance measurement. In this paper, an oversampling method is proposed based on the Safe-Level synthetic minority oversampling technique (Safe-Level SMOTE), with its k-nearest neighbours (KNN) function modified to better suit high-dimensional data. Using a convolutional neural network (CNN) as the classifier, combined with ensemble bagging with differentiated sampling rate (DSR), the approach demonstrates better performance than other state-of-the-art methods in handling imbalanced data.
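The abstract does not spell out how the KNN function is modified; the chapter title points to Manhattan distance. The following is a minimal sketch under that assumption: the neighbour search of standard Safe-Level SMOTE (as published by Bunkhumpornpat et al., 2009) is run with the L1 distance in place of the Euclidean one. The function names `manhattan_knn`, `safe_level` and `safe_level_smote_l1` are hypothetical and not taken from the paper, and the CNN classifier and DSR bagging stages are not shown.

```python
import numpy as np

def manhattan_knn(X, query, k, exclude=None):
    """Indices of the k rows of X nearest to `query` under the L1 distance."""
    d = np.abs(np.asarray(X, dtype=float) - query).sum(axis=1)
    order = np.argsort(d)
    if exclude is not None:
        order = order[order != exclude]  # drop the query point itself
    return order[:k]

def safe_level(X, y, idx, k, minority):
    """Number of minority samples among the k nearest neighbours of X[idx]."""
    nn = manhattan_knn(X, X[idx], k, exclude=idx)
    return int(np.sum(y[nn] == minority))

def safe_level_smote_l1(X, y, minority, k=5, rng=None):
    """One synthetic sample per minority point, following the Safe-Level
    SMOTE gap rules, with L1 neighbour search (an assumption of this sketch)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    rng = np.random.default_rng(rng)
    min_idx = np.flatnonzero(y == minority)
    synthetic = []
    for i in min_idx:
        others = min_idx[min_idx != i]
        if others.size == 0:
            continue
        # Random partner among the k nearest minority neighbours of X[i].
        nn = manhattan_knn(X[others], X[i], k)
        j = others[rng.choice(nn)]
        sl_p = safe_level(X, y, i, k, minority)
        sl_n = safe_level(X, y, j, k, minority)
        if sl_n == 0:
            if sl_p == 0:
                continue          # both points look like noise: skip
            gap = 0.0             # partner is noise: duplicate X[i]
        else:
            ratio = sl_p / sl_n
            if ratio == 1:
                gap = rng.uniform(0.0, 1.0)
            elif ratio > 1:
                gap = rng.uniform(0.0, 1.0 / ratio)   # stay close to X[i]
            else:
                gap = rng.uniform(1.0 - ratio, 1.0)   # stay close to X[j]
        synthetic.append(X[i] + gap * (X[j] - X[i]))
    if not synthetic:
        return np.empty((0, X.shape[1]))
    return np.array(synthetic)
```

Calling `safe_level_smote_l1(X, y, minority=1, k=5)` on a pixel-by-band matrix `X` returns synthetic minority spectra that can be appended to the training set. The L1 distance is used here because distance contrast tends to degrade more slowly for lower-order norms as dimensionality grows, which is a general property of high-dimensional data and not a result reported in the abstract.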
Acknowledgement
Tajul Miftahushudur would like to acknowledge the Scholarship provided by the Indonesian Endowment Fund for Education (LPDP).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Miftahushudur, T., Grieve, B., Yin, H. (2021). Ensemble Synthetic Oversampling with Manhattan Distance for Unbalanced Hyperspectral Data. In: Yin, H., et al. Intelligent Data Engineering and Automated Learning – IDEAL 2021. IDEAL 2021. Lecture Notes in Computer Science, vol 13113. Springer, Cham. https://doi.org/10.1007/978-3-030-91608-4_6
DOI: https://doi.org/10.1007/978-3-030-91608-4_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-91607-7
Online ISBN: 978-3-030-91608-4
eBook Packages: Computer Science, Computer Science (R0)