Seismic Signal Interpretation for Reservoir Facies Classification

Saikia, Pallabi; Nankani, Deepankar; Baruah, Rashmi Dutta

doi:10.1007/978-3-030-34872-4_45

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11942)

Included in the following conference series:

International Conference on Pattern Recognition and Machine Intelligence

Abstract

Understanding the distribution of facies is an important part of characterising a hydrocarbon reservoir. Facies classes are defined by rock characteristics that can indicate where good-quality sand and hydrocarbons are present, and thereby help geologists decide where to drill a production well in a hydrocarbon field. However, because the earth’s subsurface is heterogeneous and nonlinear, gathering facies information is a difficult task. Researchers from domains such as machine learning, geology, and geophysics work towards understanding facies distribution, but interpretation becomes harder as reservoir complexity increases. This work describes a case study involving a framework that classifies the facies categories in a reservoir from seismic data using different machine learning models. The framework can also handle the data imbalance problem that often occurs in datasets of this kind. Moreover, to gain more confidence in the developed model, we used Local Interpretable Model-Agnostic Explanations to interpret its predictions. The interpretations generated can help geologists rate the applicability of the developed model in their domain.

1 Introduction

Deciding where to drill a production well requires an understanding of the reservoir, which is the subject of Reservoir Characterisation (RC). RC [11] aims to identify potential locations of hydrocarbons by modelling the earth’s subsurface. To obtain the most comprehensive understanding of a reservoir, RC draws on geophysical data sources. Seismic signals obtained from a seismic survey are the most common geophysical data source used to interpret the subsurface without directly penetrating the earth’s crust. A seismic survey sends acoustic signals into the crust, where they are reflected back by the different subsurface layers. The reflected signals are then recorded, processed and analysed to interpret the underlying subsurface characteristics. Interpreting the subsurface in terms of facies classes is important for distinctly defining rocks of interest and for building a better understanding of the depositional environments of the subsurface [13]. With the emergence of Machine Learning (ML) in many application areas in recent years, many researchers have suggested the use of data-driven methods in this domain [2, 6, 12, 16, 17, 18, 20, 24]. Among data-driven models, the Artificial Neural Network (ANN) has been a research hot spot in recent years, and many researchers focus on how to apply ANNs effectively [10, 21, 22] in this field to improve modelling accuracy. Several studies [8, 10, 14] suggest that neural networks can solve complex calculations and discover very complex relations within data that conventional algorithms often fail to discover. ANNs have also been applied successfully in petroleum engineering, even with sparse data [4, 5]. However, the facies classification literature [3, 8, 9, 19] shows that classifying facies from seismic data is more challenging than classifying them from well logs. This is because of the complex relationship between geophysical data obtained from different sources. Moreover, the uncertainty in seismic data interpretation is higher than in well log interpretation, because seismic data cannot resolve the subsurface at the same resolution as well logs. Hence, in fields with a limited number of hydrocarbon wells, understanding the relation between seismic data and facies can be challenging. In addition, because the subsurface is heterogeneous, the data obtained can be imbalanced across the available classes, and a machine learning model built on such data can be biased towards the majority classes.

In this work, we consider a case study of a hydrocarbon field with a limited number of hydrocarbon wells (seven) and imbalanced data samples. The task is to classify the three types of facies present in the field (shale, brine sand, and hydrocarbon-bearing sand) from the acquired seismic data. To solve this problem, we developed a framework that consists of: preprocessing, to make the data suitable for ML models; modelling using an ANN, to capture the complex relationship between seismic attributes and facies classes; and interpretation of the developed model, to understand what drives its predictions. The framework can be used by any researcher in this field without much domain knowledge, and its end result can be given to experts to assess the model’s applicability in their field. It is worth noting that, unlike in other application domains of ML, the interpretability of ML models has not previously been studied in this field. To the best of our knowledge, this is an initial work on the topic.

The organisation of this paper is as follows. Section 2 provides the data description. Section 3 provides the detailed description of the methodology followed. Section 4 provides details of experimentation performed with the results obtained. Section 5 provides the conclusions and the future work.

2 Data Description

The dataset is confidential and was provided by GEOPIC, ONGC. It consists of data from seven wells in the hydrocarbon field under consideration (36000 instances in total, with each well contributing around 5000 to 6000 instances). The data comprise fourteen seismic attributes and three facies classes: Class 0 corresponds to shale, Class 1 to brine sand, and Class 2 to hydrocarbon-bearing sand. The fourteen seismic attributes are: Amplitude Envelope (X1), Amplitude Weighted Cosine Phase (X2), Amplitude Weighted Frequency (X3), Amplitude Weighted Phase (X4), Derivative (X5), Dominant Frequency (X6), Instantaneous Frequency (X7), Instantaneous Phase (X8), Integrate (X9), Integrated Absolute Amplitude (X10), P-Impedance (X11), Quadrature Trace (X12), Seismic (X13), and VpVs (X14). They are extracted from the main reflected seismic signal recorded with respect to Time (X0). With this data, we need to model the relation between the seismic attributes and facies in order to correctly identify the facies classes in the hydrocarbon field from seismic data.

3 Methodology

The proposed framework is shown in Fig. 1. It consists of three steps: preprocessing, modelling, and interpretation. Preprocessing handles the imbalanced dataset by oversampling the minority classes and undersampling the majority class, normalises the input (seismic) attributes using Z-score normalisation, and converts the output (facies classes) to a one-hot encoding to make the data suitable for ML models. The modelling step applies an ANN, along with other ML models for comparison, to model the complex relationship between the input (seismic attributes) and the output (facies class). The last step, interpretation, uses Local Interpretable Model-Agnostic Explanations to understand how a model works. A detailed description of, and the motivation for, each step is provided in Subsects. 3.1 to 3.3.
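The normalisation and encoding parts of preprocessing can be illustrated with a minimal sketch. The function names and toy values below are hypothetical and are not taken from the paper. Fitting the statistics on the training rows and reusing them for other splits is an assumption of this sketch, since the text does not say how the statistics were computed.

```python
from statistics import mean, pstdev

def zscore_fit(rows):
    """Return per-column (mean, std) computed on the given rows."""
    cols = list(zip(*rows))
    # A constant column has std 0; fall back to 1.0 to avoid division by zero.
    return [(mean(c), pstdev(c) or 1.0) for c in cols]

def zscore_apply(rows, stats):
    """Scale every column with previously fitted statistics."""
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]

def one_hot(labels, n_classes=3):
    """Encode integer facies labels (0, 1, 2) as one-hot vectors."""
    return [[1 if k == y else 0 for k in range(n_classes)] for y in labels]

# Toy values for two attributes; not taken from the field data.
train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
stats = zscore_fit(train)
scaled = zscore_apply(train, stats)
encoded = one_hot([0, 2, 1])  # shale, hydrocarbon-bearing sand, brine sand
```

After scaling, each attribute has zero mean and unit standard deviation on the rows used for fitting, so attributes with very different physical ranges contribute on a comparable scale.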

3.1 Preprocessing

Table 1. Description of classes before and after resampling

Table 1 shows that the original dataset is heavily skewed towards Class 0, which accounts for around 85% of the data, whereas the other two classes combined contribute only 14%. A model built on this dataset would be highly biased towards the majority class, which motivated us to use a resampling method. The dataset is resampled with the Synthetic Minority Oversampling Technique (SMOTE) [7], a popular technique in the literature for handling imbalanced datasets. SMOTE reduces the imbalance by synthetically generating minority class instances based on feature space similarities between existing minority instances. Here it is used to generate more samples of Class 1 and Class 2. SMOTE alone is not sufficient to resample the data effectively, because generating too many synthetic samples of a class lets noise dominate the dataset. Therefore, we also apply random undersampling to Class 0 to reduce its sample count and obtain a less biased dataset [1]. Table 1 also shows the dataset distribution after resampling.
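As a rough illustration of the resampling step, the sketch below implements the core idea of SMOTE (interpolating between a minority sample and one of its nearest minority neighbours) together with random undersampling of the majority class. It is a simplified stand-in written for this explanation, not the implementation used in the study. The function names, the value of k, the class sizes and the target counts are all assumptions.

```python
import math
import random

def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    if rng is None:
        rng = random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest neighbours of `base` among the other minority samples.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

def random_undersample(majority, n_keep, rng=None):
    """Keep a random subset of the majority class, without replacement."""
    if rng is None:
        rng = random.Random(0)
    return rng.sample(majority, n_keep)

# Toy two-attribute data with an imbalance loosely resembling the text.
rng = random.Random(42)
shale = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(85)]   # Class 0
hc_sand = [[rng.gauss(4, 1), rng.gauss(4, 1)] for _ in range(5)]  # Class 2
new_hc = smote(hc_sand, n_new=20, k=3, rng=rng)
kept_shale = random_undersample(shale, n_keep=40, rng=rng)
```

Because each synthetic point lies on a segment between two existing minority samples, oversampling too aggressively mostly replicates whatever noise those samples contain, which is why the majority class is also reduced rather than relying on SMOTE alone.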

3.2 Modelling

For modelling, we applied several state-of-the-art classification methods and compared their results. Given the state-of-the-art performance of neural networks reported in the literature, we chose the neural network as our primary model and compared it with Naive Bayes (NB), Support Vector Machine (SVM), Decision Tree (DT), and Random Forest (RF) to see how effectively each models the relation between seismic attributes and facies. The performance of ML algorithms is typically evaluated using predictive accuracy (Acc). However, accuracy is not appropriate when the data is imbalanced. Hence, model performance is also evaluated using sensitivity (Se), specificity (Sp) and precision (Pr), which are better suited to imbalanced datasets.
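To see why accuracy alone can mislead on imbalanced data, consider the following sketch. It computes Se, Sp and Pr for each class in a one-vs-rest fashion; the paper does not state how the per-class values were aggregated, so that choice, the function names and the toy labels are assumptions.

```python
def per_class_metrics(y_true, y_pred, n_classes=3):
    """One-vs-rest sensitivity, specificity and precision for each class."""
    results = {}
    for c in range(n_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        tn = sum(t != c and p != c for t, p in pairs)
        results[c] = {
            "Se": tp / (tp + fn) if tp + fn else 0.0,  # recall of class c
            "Sp": tn / (tn + fp) if tn + fp else 0.0,  # true negative rate
            "Pr": tp / (tp + fp) if tp + fp else 0.0,  # 0.0 if c never predicted
        }
    return results

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy labels with roughly 85% shale (Class 0); not the field data.
y_true = [0] * 85 + [1] * 8 + [2] * 7
y_pred = [0] * 100  # a degenerate model that always predicts shale
acc = accuracy(y_true, y_pred)
metrics = per_class_metrics(y_true, y_pred)
```

In this toy case the degenerate model reaches an accuracy of 0.85 while its sensitivity for both sand classes is 0 and its specificity for shale is 0, which is exactly the failure mode that Se, Sp and Pr expose.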

3.3 Interpretation

The last step of our framework explains the predictions of a model using Local Interpretable Model-Agnostic Explanations (LIME) [15]. LIME explains a model by perturbing the input data and relating these changes to the resulting changes in the model’s predictions. One advantage of LIME is that it is a model-agnostic technique and can be used to explain any model. Model-specific methods, in contrast, try to understand a model by studying its internal components and their behaviour. For example, interpretability methods for most deep learning models investigate activation units and link internal activations to inputs [23]. This requires deep knowledge of the model and does not transfer to other models. LIME generates a list of explanations that show the effect of each feature on the output. The explanations are created using interpretable models such as decision trees and linear models, which are trained in the close vicinity of a data instance; that is, a complex model is understood by applying simpler models locally. With LIME, we can provide explanations of our model’s predictions to geoscience experts, who can then determine whether the explanations satisfy their requirements for deploying the model in the real world.
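The local-surrogate idea can be sketched as follows. This is a simplified, LIME-style illustration and not the API of the lime package or the exact procedure of [15]: it perturbs an instance with Gaussian noise, weights the perturbed points with an exponential proximity kernel, and fits a weighted ridge regression instead of a sparse linear model. The black-box function, noise scale, kernel width and all names are hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def local_explanation(predict, instance, n_samples=500, scale=0.5,
                      kernel_width=1.0, ridge=1e-3, seed=0):
    """Fit a proximity-weighted linear surrogate around `instance`.

    Returns one coefficient per feature; its sign indicates whether the
    feature pushes the black-box output up or down near this instance."""
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        z = [v + rng.gauss(0, scale) for v in instance]  # perturbed copy
        dist = math.dist(z, instance)
        # Design row: intercept followed by offsets from the instance.
        rows.append([1.0] + [zi - vi for zi, vi in zip(z, instance)])
        targets.append(predict(z))
        weights.append(math.exp(-(dist ** 2) / kernel_width ** 2))
    # Weighted ridge regression via the normal equations; the small ridge
    # term is applied to every coefficient, including the intercept.
    p = len(instance) + 1
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights))
             + (ridge if i == j else 0.0) for j in range(p)] for i in range(p)]
    xtwy = [sum(w * r[i] * t for r, w, t in zip(rows, weights, targets))
            for i in range(p)]
    return solve(xtwx, xtwy)[1:]  # drop the intercept

def black_box(x):
    """Hypothetical classifier score that ignores the third feature."""
    return 1.0 / (1.0 + math.exp(-(2.0 * x[0] - 3.0 * x[1])))

contrib = local_explanation(black_box, [0.1, 0.2, -0.3])
```

For this toy black box, the first coefficient should come out positive, the second negative, and the third close to zero, mirroring the positive, negative and negligible contributions that a LIME plot displays for a single instance.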

4 Experimentation and Results

To build the models and study their performance, we first set one well aside (Well 1, with 5645 samples) as test data, to evaluate how the models perform at an unseen location. We then combine the data of the remaining six wells (31355 samples) into a training set. To evaluate model performance during training, we hold out 30% of the training data as a validation set. Several runs of training, validation, and testing were carried out to choose the best hyperparameters and parameters for every ML model. Figure 2 shows the performance of the considered ML models on our dataset, both with and without resampling. NB performed poorly on the validation set but comparatively better than other models on the test set. In contrast, DT and RF performed remarkably well (almost 100%) on the validation data but very poorly on the test set. This looks like a classic case of overfitting: the models could not generalise to samples from a well that was not among the training wells. The ANN’s validation performance is not as poor as NB’s, and it does not overfit like DT and RF. Hence, it outperforms all of these models on the test set. We can also observe that an ANN with more hidden layers (a deep neural network, DNN) improves accuracy on the test set but does not provide better precision, sensitivity and specificity than its shallow version. A possible explanation is that the additional layers improve accuracy by further reducing the loss function while neglecting the other performance criteria that matter for an imbalanced dataset. On a balanced dataset, however, a DNN has the capacity to improve performance in every respect. Hence we can infer that the ANN approximated a very good relation between seismic attributes and facies classes, as it can build the more complex decision boundaries required for a real-world dataset. The ANN generalises well, as can be seen from its improved accuracy on the test set.

The above observations hold both with and without resampling. However, better performance is obtained with the resampled dataset: the accuracy of the models remained almost the same, but sensitivity and specificity, which are the major evaluation criteria for an imbalanced dataset, improved by about 3% to 6%.

From this analysis, we conclude that the ANN provided the best performance. However, because it is a black-box model, geoscientists find it difficult to trust it enough to deploy it in the real world. We therefore used LIME, as shown in Fig. 3, to explain the predictions of the model. Figure 3 presents one explanation for each class. Variables on the right side of the axis contribute positively, whereas variables on the left side contribute negatively. For the Class 0 instance, features X11, X6, X0, X9, X1 and X2 contribute positively, whereas features X13 and X10 contribute negatively to the prediction of the shale class. The other attributes contribute negligibly or not at all. Similarly, for the instance predicted as Class 1, features X11, X6, and X10 contribute positively, whereas X0, X4, and X12 contribute negatively to its classification as brine sand. For the instance predicted as Class 2, features X9, X11, and X8 contribute positively, whereas X12 and X4 contribute negatively to its classification as hydrocarbon sand. Moreover, it is because the values of these features fall within certain ranges (as specified) that the instances are assigned to a particular class. This kind of interpretation can help geoscientists validate whether the reason for classifying an instance should be trusted, and hence decide on the model’s applicability in the real world.
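The hold-out protocol described at the start of this section (one well reserved for testing, 30% of the remaining data used for validation) can be sketched as below. The tuple layout, the function name and the use of a plain random shuffle are assumptions; the text does not state how the 30% validation portion was drawn.

```python
import random

def split_by_well(samples, test_well, val_fraction=0.3, seed=0):
    """Split (well_id, features, label) tuples into train, validation, test.

    All samples of `test_well` form the test set, so the model is evaluated
    on a location it has never seen during training."""
    test = [s for s in samples if s[0] == test_well]
    pool = [s for s in samples if s[0] != test_well]
    rng = random.Random(seed)
    rng.shuffle(pool)
    n_val = int(round(val_fraction * len(pool)))
    return pool[n_val:], pool[:n_val], test

# Toy data: seven wells with ten samples each; not the field data.
samples = [(well, [float(i)], i % 3) for well in range(1, 8) for i in range(10)]
train, val, test = split_by_well(samples, test_well=1)
```

Splitting by well rather than by row matters here: validation rows drawn from the training wells resemble the training data, so a model such as DT or RF can score very well on validation and still fail on the held-out well.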

5 Conclusion and Future Work

This paper presents a data-driven framework for facies classification. The challenge of imbalanced classes in the data is handled with an oversampling method, SMOTE, which effectively deals with the bias in the dataset towards specific classes. The ANN, built to model the complex relationships within the data, provided the best performance on test data belonging to a completely new well. As important decisions on drilling a new well have to be taken based on model predictions, this paper also presented LIME results to interpret the predictions made by the model. The interpretation shows that not all seismic attributes are affected by facies characteristics, and that some facies classes affect some seismic attributes more than others. This is a very early work on interpretation in this field, and further analysis that incorporates domain knowledge can be performed on the interpretation of the model. It is also worth mentioning that the composition of the earth’s subsurface varies greatly from location to location, which limits the applicability of a model from one field to another and opens new avenues for research on modelling the complex underlying relations in this field.

References

1. Agrawal, A., Viktor, H.L., Paquet, E.: SCUT: multi-class imbalanced data classification using SMOTE and cluster-based undersampling. In: International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, vol. 1, pp. 226–234. IEEE (2015)
2. Al-Anazi, A., Gates, I.: On the capability of support vector machines to classify lithology from well logs. Nat. Resour. Res. 19(2), 125–139 (2010)
3. Ashraf, U., et al.: Classification of reservoir facies using well log and 3D seismic attributes for prospect evaluation and field development: a case study of Sawan gas field, Pakistan. J. Petrol. Sci. Eng. 175, 338–351 (2019)
4. Auda, G., Kamel, M.: Modular neural networks: a survey. Int. J. Neural Syst. 9(02), 129–151 (1999)
5. Ayala, L.F., Ertekin, T.: Analysis of gas-cycling performance in gas/condensate reservoirs using neuro-simulation. In: SPE Annual Technical Conference and Exhibition. Society of Petroleum Engineers (2005)
6. Bhattacharya, S., Mishra, S.: Applications of machine learning for facies and fracture prediction using Bayesian network theory and random forest: case studies from the Appalachian basin, USA. J. Petrol. Sci. Eng. 170, 1005–1017 (2018)
7. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002)
8. Dubois, M.K., Bohling, G.C., Chakrabarti, S.: Comparison of four approaches to a rock facies classification problem. Comput. Geosci. 33(5), 599–617 (2007)
9. Ferreira, D.J.A., et al.: Unsupervised seismic facies classification applied to a presalt carbonate reservoir, Santos basin, offshore Brazil. AAPG Bull. 103(4), 997–1012 (2019)
10. Imamverdiyev, Y., Sukhostat, L.: Lithological facies classification using deep convolutional neural network. J. Petrol. Sci. Eng. 174, 216–228 (2019)
11. Lines, L.R., Newrick, R.T.: Fundamentals of geophysical interpretation. Society of Exploration Geophysicists (2004)
12. Mishra, S., Datta-Gupta, A.: Applied Statistical Modeling and Data Analytics: A Practical Guide for the Petroleum Geosciences. Elsevier, Amsterdam (2017)
13. Mur, A., Waters, K.: Play scale seismic characterisation-using basin models as an input to seismic characterisation in new and emerging PLA. In: 80th EAGE Conference and Exhibition (2018)
14. Nakutnyy, P., Asghari, K., Torn, A., et al.: Analysis of waterflooding through application of neural networks. In: Canadian International Petroleum Conference. Petroleum Society of Canada (2008)
15. Ribeiro, M.T., Singh, S., Guestrin, C.: Why should I trust you?: explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144. ACM (2016)
16. Salehi, S.M., Honarvar, B.: Automatic identification of formation lithology from well log data: a machine learning approach. J. Petrol. Sci. Res. 3(2), 73–82 (2014)
17. Sebtosheikh, M.A., Salehi, A.: Lithology prediction by support vector classifiers using inverted seismic attributes data and petrophysical logs as a new approach and investigation of training data set size effect on its performance in a heterogeneous carbonate reservoir. J. Petrol. Sci. Eng. 134, 143–149 (2015)
18. Silva, A.A., Neto, I.A.L., Misságia, R.M., Ceia, M.A., Carrasquilla, A.G., Archilha, N.L.: Artificial neural networks to support petrographic classification of carbonate-siliciclastic rocks using well logs and textural information. J. Appl. Geophys. 117, 118–125 (2015)
19. Vashist, N., Dennis, R., Rajvanshi, A., Taneja, H., Walia, R., Sharma, P.: Reservoir facies and their distribution in a heterogeneous carbonate reservoir: an integrated approach. In: SPE Annual Technical Conference and Exhibition. Society of Petroleum Engineers (1993)
20. Wang, G., Carr, T.R., Ju, Y., Li, C.: Identifying organic-rich Marcellus Shale lithofacies by support vector machine classifier in the Appalachian basin. Comput. Geosci. 64, 52–60 (2014)
21. Wei, Z., Hu, H., Zhou, H.W., Lau, A.: Characterizing rock facies using machine learning algorithm based on a convolutional neural network and data padding strategy. Pure Appl. Geophys. 1–13 (2019)
22. Zhang, D., Yuntian, C., Jin, M.: Synthetic well logs generation via recurrent neural networks. Petrol. Explor. Dev. 45(4), 629–639 (2018)
23. Zhang, Q.S., Zhu, S.C.: Visual interpretability for deep learning: a survey. Front. Inf. Technol. Electron. Eng. 19(1), 27–39 (2018)
24. Zhao, T., Jayaram, V., Roy, A., Marfurt, K.J.: A comparison of classification techniques for seismic facies recognition. Interpretation 3(4), SAE29–SAE58 (2015)

Author information

Authors and Affiliations

Indian Institute of Technology Guwahati, Guwahati, Assam, India
Pallabi Saikia, Deepankar Nankani & Rashmi Dutta Baruah

Corresponding author

Correspondence to Pallabi Saikia.

Editor information

Editors and Affiliations

Tezpur University, Tezpur, India
Bhabesh Deka
Indian Statistical Institute, Kolkata, India
Pradipta Maji
Indian Statistical Institute, Kolkata, India
Sushmita Mitra
Tezpur University, Tezpur, India
Dhruba Kumar Bhattacharyya
Indian Institute of Technology Guwahati, Guwahati, India
Prabin Kumar Bora
Indian Statistical Institute, Kolkata, India
Sankar Kumar Pal

About this paper

Cite this paper

Saikia, P., Nankani, D., Baruah, R.D. (2019). Seismic Signal Interpretation for Reservoir Facies Classification. In: Deka, B., Maji, P., Mitra, S., Bhattacharyya, D., Bora, P., Pal, S. (eds) Pattern Recognition and Machine Intelligence. PReMI 2019. Lecture Notes in Computer Science, vol 11942. Springer, Cham. https://doi.org/10.1007/978-3-030-34872-4_45

DOI: https://doi.org/10.1007/978-3-030-34872-4_45
Published: 25 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34871-7
Online ISBN: 978-3-030-34872-4
eBook Packages: Computer Science, Computer Science (R0)
