Abstract
Extreme learning machine (ELM) is a novel supervised machine learning algorithm that offers fast learning speed, good generalization, and high classification performance, and that avoids problems such as local minima, unsuitable learning rates, excessive numbers of iterations, and overfitting. However, its classification performance is degraded by imbalanced training data. To address this problem, the synthetic minority oversampling technique (SMOTE) was integrated with the ELM algorithm to construct a hybrid algorithm, called SMOTified ELM, which was used to identify polymetallic mineralization anomalies from the 1:50,000 drainage sediment survey data in the Yeniugou area of Tokexun County, Xinjiang, China. A comparison between the two models shows that the SMOTified ELM model is superior to the ELM model in terms of receiver operating characteristic (ROC) curves and areas under the ROC curve (AUCs). The ROC curve of the SMOTified ELM model is closer to the upper left corner of the ROC space than that of the ELM model, and the AUC value of the SMOTified ELM model (0.963) is higher than that of the ELM model (0.898). The polymetallic mineralization anomalies identified by the SMOTified ELM model account for 10.61% of the study area and contain 100% of the known polymetallic deposits, whereas those identified by the ELM model account for 8.00% of the study area and contain 89% of the known polymetallic deposits. Therefore, the SMOTified ELM method is a potentially useful technique for building a high-performance supervised mineralization anomaly identification model.
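The workflow summarized above (oversample the minority class with SMOTE, train an ELM on the balanced data, then score the model by AUC) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function and class names (`smote`, `ELM`, `auc`), the sigmoid activation, the number of hidden nodes, the neighbor count `k`, and the randomly generated toy data are all assumptions chosen for illustration only.

```python
import numpy as np

def smote(X_min, n_new, k=5, rng=None):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority
    neighbors. Requires at least two minority samples."""
    rng = np.random.default_rng(rng)
    n = X_min.shape[0]
    k = min(k, n - 1)
    # Pairwise squared distances among minority samples
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a sample is not its own neighbor
    neighbors = np.argsort(d2, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)
    nb = neighbors[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[nb] - X_min[base])

class ELM:
    """Single-hidden-layer ELM: random, fixed input weights; output weights
    solved in one step with the Moore-Penrose pseudoinverse."""
    def __init__(self, n_hidden=20, rng=None):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(rng)

    def _hidden(self, X):
        # Sigmoid activation (an assumption; other activations are possible)
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        self.beta = np.linalg.pinv(self._hidden(X)) @ y
        return self

    def decision_function(self, X):
        return self._hidden(X) @ self.beta

def auc(y_true, score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as one half."""
    y_true, score = np.asarray(y_true), np.asarray(score)
    pos, neg = score[y_true == 1], score[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (pos.size * neg.size)

# Hypothetical toy data: 200 background samples, 10 "mineralized" samples
rng = np.random.default_rng(0)
X_bg = rng.normal(0.0, 1.0, size=(200, 4))
X_min = rng.normal(2.0, 1.0, size=(10, 4))
X = np.vstack([X_bg, X_min])
y = np.r_[np.zeros(200), np.ones(10)]

# Balance the classes with synthetic minority samples, then train the ELM
X_syn = smote(X_min, n_new=190, k=5, rng=1)
X_bal = np.vstack([X, X_syn])
y_bal = np.r_[y, np.ones(190)]
model = ELM(n_hidden=20, rng=2).fit(X_bal, y_bal)

scores = model.decision_function(X)
print("AUC on the original toy samples:", auc(y, scores))
```

Because each synthetic sample lies on the line segment between two existing minority samples, SMOTE enlarges the minority class without leaving the region it already occupies. In practice, the AUC should be computed on held-out data rather than on the training samples used here.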
Data availability
The geological and geochemical data are not publicly available due to the confidentiality requirements of the 11th Geological Branch of the Geology and Mineral Resources Development Bureau of Xinjiang, China. All Python source code is available from the authors upon request.
Acknowledgements
The authors are grateful to Ms. Min Guo, Mr. Jiaxing Chen and Mr. Sheng He for their kind help in collecting geochemical exploration data. Geological and geochemical exploration data were provided by the 11th Geological Branch of the Geology and Mineral Resources Development Bureau of Xinjiang, China.
Funding
This research was supported by the National Natural Science Foundation of China (Grant no. 42172324).
Author information
Contributions
Yongliang Chen provided the algorithm model and developed the Python code for the algorithm. Alina Shayilan conducted the case study under the guidance of Yongliang Chen. Alina Shayilan wrote the manuscript and Yongliang Chen edited it.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Shayilan, A., Chen, Y. A SMOTified extreme learning machine for identifying mineralization anomalies from geochemical exploration data: a case study from the Yeniugou area, Xinjiang, China. Earth Sci Inform 17, 1329–1343 (2024). https://doi.org/10.1007/s12145-024-01246-1
DOI: https://doi.org/10.1007/s12145-024-01246-1