A SMOTified extreme learning machine for identifying mineralization anomalies from geochemical exploration data: a case study from the Yeniugou area, Xinjiang, China

  • RESEARCH
Earth Science Informatics

Abstract

Extreme learning machine (ELM) is a novel supervised machine learning algorithm that offers fast learning, good generalization, and high classification performance, and that avoids problems such as local minima, unreasonable learning rates, excessive numbers of iterations, and overfitting. However, its classification performance degrades when the training data are imbalanced. To solve this problem, the synthetic minority oversampling technique (SMOTE) was integrated with the ELM algorithm to construct a hybrid algorithm, called SMOTified ELM, which was used to identify polymetallic mineralization anomalies from the 1:50,000 drainage sediment survey data in the Yeniugou area of Tokexun County, Xinjiang, China. A comparison shows that the SMOTified ELM model is superior to the ELM model in terms of receiver operating characteristic (ROC) curves and areas under the ROC curve (AUCs). The ROC curve of the SMOTified ELM model lies closer to the upper left corner of the ROC space than that of the ELM model, and its AUC value (0.963) is higher than that of the ELM model (0.898). The polymetallic mineralization anomalies identified by the SMOTified ELM model account for 10.61% of the study area and contain 100% of the known polymetallic deposits, whereas those identified by the ELM model account for 8.00% of the study area and contain 89% of the known polymetallic deposits. Therefore, the SMOTified ELM method is a potentially useful technique for building a high-performance supervised model for identifying mineralization anomalies.
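The workflow summarized in the abstract (oversample the minority class with SMOTE, train an ELM, compare AUCs) can be illustrated with a minimal sketch. This is not the authors' code, which is available only on request. It uses the Python standard library only, toy Gaussian data rather than the Yeniugou survey data, and hypothetical names (`smote`, `solve`, `ELM`, `auc`). Solving for the output weights by ridge-regularized least squares is an assumption; the abstract does not state how the output weights were obtained.

```python
import math
import random

def smote(minority, n_new, k=3, rng=random):
    """Create n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

class ELM:
    """Single-hidden-layer network: the hidden weights are random and stay
    fixed; only the output weights are fitted, in closed form."""
    def __init__(self, n_inputs, n_hidden=20, ridge=1e-3, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.ridge = ridge  # assumed regularization strength

    def _hidden(self, x):
        # Sigmoid activation of each random hidden unit.
        return [1 / (1 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
                for row, b in zip(self.W, self.b)]

    def fit(self, X, y):
        H = [self._hidden(x) for x in X]
        L = len(self.b)
        # Normal equations: (H^T H + ridge * I) beta = H^T y
        A = [[sum(h[i] * h[j] for h in H) + (self.ridge if i == j else 0.0)
              for j in range(L)] for i in range(L)]
        rhs = [sum(h[i] * t for h, t in zip(H, y)) for i in range(L)]
        self.beta = solve(A, rhs)
        return self

    def score(self, x):
        return sum(h * b for h, b in zip(self._hidden(x), self.beta))

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly;
    ties count as half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy imbalanced data: 200 "background" samples and 10 "anomaly" samples.
rng = random.Random(42)
def sample(n, centre, spread):
    return [[rng.gauss(centre, spread), rng.gauss(centre, spread)]
            for _ in range(n)]

anomalies = sample(10, 2.5, 0.5)
X_train, y_train = sample(200, 0.0, 1.0) + anomalies, [0] * 200 + [1] * 10
# Oversample the minority class until both classes have 200 samples.
X_bal = X_train + smote(anomalies, n_new=190, k=3, rng=rng)
y_bal = y_train + [1] * 190
X_test, y_test = sample(200, 0.0, 1.0) + sample(10, 2.5, 0.5), [0] * 200 + [1] * 10

for name, (X, y) in {"ELM": (X_train, y_train),
                     "SMOTE + ELM": (X_bal, y_bal)}.items():
    model = ELM(n_inputs=2).fit(X, y)
    print(name, "AUC:", round(auc([model.score(x) for x in X_test], y_test), 3))
```

Because each synthetic sample lies on a line segment between two existing minority samples, the oversampled class stays within the region already occupied by known anomalies. On toy data like this the two AUCs may be close; the gap reported in the abstract refers to the authors' own data and model settings.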

Data availability

The geological and geochemical data are not publicly available due to the confidentiality requirements of the 11th Geological Branch of Geology and Mineral Resources Development Bureau of Xinjiang, China. All Python source code is available from the authors upon request.

Acknowledgements

The authors are grateful to Ms. Min Guo, Mr. Jiaxing Chen and Mr. Sheng He for their kind help in collecting geochemical exploration data. Geological and geochemical exploration data were provided by the 11th Geological Branch of Geology and Mineral Resources Development Bureau of Xinjiang, China.

Funding

This research was supported by the National Natural Science Foundation of China (Grant no. 42172324).

Author information

Contributions

Yongliang Chen provided the algorithm model and developed the Python code for the algorithm. Alina Shayilan conducted the case study under the guidance of Yongliang Chen. Alina Shayilan wrote the manuscript and Yongliang Chen edited it.

Corresponding author

Correspondence to Yongliang Chen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shayilan, A., Chen, Y. A SMOTified extreme learning machine for identifying mineralization anomalies from geochemical exploration data: a case study from the Yeniugou area, Xinjiang, China. Earth Sci Inform 17, 1329–1343 (2024). https://doi.org/10.1007/s12145-024-01246-1

  • DOI: https://doi.org/10.1007/s12145-024-01246-1
