Abstract
Lithology identification is an important task in oil and gas exploration. In recent years, machine learning methods have become a powerful tool for intelligent lithology identification. To address the redundancy of conventional logging data and the unbalanced distribution among formation lithology classes caused by the complexity of the depositional environment and the inhomogeneity of the subsurface, this paper investigates an affiliation-weighted one-versus-one support vector machine (WOVOSVM) lithology identification method based on geochemical logging data. The method takes geochemical logging data, which directly reflect formation lithology, as input and uses WOVOSVM to achieve intelligent and accurate lithology classification. The Shahezi Formation in the Songke 2 Well, Songliao Basin, China, is taken as the experimental object, and two data sets with different distribution characteristics are selected as input. WOVOSVM, AdaBoost, random forest (RF), and the traditional support vector machine (SVM) are used to identify lithology, and their results are compared and analyzed. The results are as follows: (1) The accuracy metrics of most of the four classification models were above 60%, indicating that geochemical logging data effectively reflect formation lithology and are a reliable indicator for intelligent logging-based lithology identification. (2) When the data set is strongly imbalanced, WOVOSVM identifies lithology better than the other methods: its accuracy metrics average more than 72%, and its F1 value is 8.77% to 14.56% higher than those of the other models. In particular, for the small-sample lithology categories, 70% of the samples are correctly classified.
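The abstract reports accuracy alongside F1 and the fraction of small-sample lithologies that are correctly classified. A minimal sketch of why these metrics can disagree on an imbalanced data set is given below. It is not the authors' evaluation code: the labels are made-up toy data, the class names "sandstone" and "tuff" are hypothetical, and macro-averaged F1 is an assumption, since the abstract does not state how F1 was averaged.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so each class counts equally."""
    scores = per_class_scores(y_true, y_pred)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Toy, made-up labels (hypothetical class names): 18 majority, 2 minority.
y_true = ["sandstone"] * 18 + ["tuff"] * 2

# Classifier A ignores the minority class entirely.
y_majority = ["sandstone"] * 20

# Classifier B recovers both minority samples but mislabels two majority ones.
y_balanced = ["sandstone"] * 16 + ["tuff"] * 4

for name, y_pred in [("A", y_majority), ("B", y_balanced)]:
    minority_recall = per_class_scores(y_true, y_pred)["tuff"][1]
    print(name, accuracy(y_true, y_pred), macro_f1(y_true, y_pred), minority_recall)
```

Both toy classifiers make the same number of errors and therefore have the same accuracy, but only the second one identifies any minority-class samples. Macro F1 and minority-class recall separate the two, which is why such metrics are informative when comparing classifiers on imbalanced lithology classes.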










Data availability
The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.
Funding
This research was funded by the National Natural Science Foundation of China (grant numbers 41972112 and 42272134) and the China Geological Survey (grant number DD20190010).
Author information
Contributions
All authors contributed to the study conception and design. Data were collected by Xiang Li and Zhifeng Zhang. The experimental models were constructed by Shitao Yin. The development and testing of the presented methods were performed by Shitao Yin and Xiaochun Lin. The manuscript was written by Yongjian Huang and Shitao Yin. All authors commented on the manuscript, and all authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Communicated by: H. Babaie
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Yin, S., Lin, X., Huang, Y. et al. Application of improved support vector machine in geochemical lithology identification. Earth Sci Inform 16, 205–220 (2023). https://doi.org/10.1007/s12145-022-00932-2
DOI: https://doi.org/10.1007/s12145-022-00932-2