Research paper
A new procedure for generating data covariance inflation factors for ensemble smoother with multiple data assimilation

https://doi.org/10.1016/j.cageo.2021.104722Get rights and content

Highlights

  • We propose a new method to generate ES-MDA inflation factors.

  • The first and last factors are computed using the method of Rafiee and Reynolds.

  • The other inflation factors are computed geometrically in decreasing order.

  • The new method achieves better ES-MDA outcomes with fewer assimilations.

Abstract

The ensemble smoother with multiple data assimilation (ES-MDA) has gained much attention as a powerful tool for history matching problems. Previous studies showed that it could provide both a good match of data and estimates of model parameters. In the original ES-MDA formulation, the number of data assimilation and covariance inflation factors are determined in advance. Selecting them in a decreasing order may improve the final results. Moreover, recent studies propose some theoretical and practical methods to select inflation factors based on the discrepancy principle. This work aims to introduce a new method for generating the data covariance inflation factors for ES-MDA. In the new method, the first inflation factor is generated using a Levenberg–Marquardt regularizing scheme. The last inflation factor is set by a parameter that limits its magnitude, computed using the singular values of the dimensionless sensitivity matrix estimated from the prior ensemble. As a result, the method computes the correct number of data assimilations that produces inflation factors such that the sum of their inverse is equal to one, as required by ES-MDA. It is shown through a synthetic two-dimensional water flooding history matching problem that the proposed methodology achieves both better model parameter match and data match with a smaller number of assimilations than the methods available in the literature.

Introduction

Ensemble-based methods have become a very popular approach for history matching problems. Their easy implementation and providing both a good match of observed data and a reasonable estimate of model parameters must be the reason for their popularity (Oliver et al., 2008). Also, they do not require to compute any derivative to determine the sensitivity matrix. The Ensemble Kalman Filter (EnKF) (Evensen, 2003) might be the most famous ensemble-based method, with a large number of published researches (Coutinho et al., 2010, Chen and Oliver, 2010, Emerick and Reynolds, 2011). However, EnKF assimilates data sequentially in time, which may result in high computational cost. In practical reservoir simulation applications, the simulator needs to be restarted at every time step, which may increase time consumption (Emerick and Reynolds, 2013a).

van Leeuwen and Evensen (1996) proposed the Ensemble Smoother (ES), which updates the vector of model parameters only once by running the simulator from time zero until the end of the historical period. Both EnKF and ES try to characterize the posterior probability density function by estimating the first two statistical moments (Rafiee and Reynolds, 2017, Oliver et al., 2008). Nevertheless, the single update computed by ES seems to be unable to provide a good match of data as well as acceptable model parameters estimates (Emerick and Reynolds, 2013b). It can be reinforced by the study of Reynolds et al. (2006), which proves that applying EnKF, at each time step, and, consequently, ES, is similar to a full-step Gauss–Newton method (Nocedal and Wright, 2006) using an average sensitivity matrix estimated from the prior ensemble, which has been proven to be ineffective to produce satisfactory results (Shiranji and Emerick, 2016).

Many iterative forms of the ES have been proposed to improve its primary formulation (Chen and Oliver, 2012, Chen and Oliver, 2013, Iglesias, 2015). One of them is the ensemble smoother with multiple data assimilation (ES-MDA) (Emerick and Reynolds, 2013a), which has proved to be efficient in highly nonlinear history matching problems (Emerick and Reynolds, 2013b, Emerick, 2017, Canchumuni et al., 2019). The idea of the ES-MDA is to assimilate the same data multiple times with an inflated data error covariance matrix. Emerick and Reynolds (2013a) proved that the ES-MDA obtains a correct sample of the posterior probability function for the linear-Gaussian case if the sum of the inverse of the inflation factors is equal to one. Thus, a simple choice is selecting them equal to the number of data assimilations. They also observed that selecting the inflation factors in a decreasing order may improve the results of ES-MDA. Silva et al. (2019) compared the results obtained by the ES and ES-MDA and concluded that ES-MDA with constant inflation factors provides a better match of data than the regular ES.

The problem of selecting all the inflation factors equal to the number of data assimilations is that if it is not enough to provide reasonable results, one may need to restart the assimilation process with a higher number of assimilations, which increases computational time. Another problem is that it may cause overcorrections of the model parameters (Emerick, 2019, Rafiee and Reynolds, 2017, Le et al., 2016). In a study presented by Le et al. (2016), selecting all the inflation factors equal to the number of assimilations led to overshooting in the final permeability and porosity field when implementing the ES-MDA with 8 and 16 assimilation steps. Therefore, they proposed two adaptive procedures for selecting inflation factors. The first one, ES-MDA-RS, is based on the average data mismatch function, determining inflation factors that do not result in great changes in model parameters at each assimilation step. The second method, ES-MDA-RLM, is based on a study proposed by Iglesias and Dawson (2013), that uses a regularizing scheme proposed by Hanke (1997) to compute the inflation factors at each assimilation step. Although the methods proposed by Le et al. (2016) improved the performance of the ES-MDA, it often requires many iterations, which may hinder the application for large-scale problems (Rafiee and Reynolds, 2017). Emerick (2016) proposed a method to select the inflation factors adaptatively, based on the average data mismatch function multiplied by a factor smaller than one. The termination criterion is when the sum of the inverse of the inflation factors becomes equal to one. Emerick (2016) tested the method in a real field case. However, the results were not significantly better than the standard ES-MDA.

Motivated by the studies of Le et al. (2016) and Iglesias, 2015, Rafiee and Reynolds, 2017 presented an ES-MDA algorithm where the first inflation factor is computed based on the regularization condition for Levenberg–Marquardt algorithms of Hanke (1997). One of their objectives was to determine the number of assimilation a priori to maintain low computational cost and assimilation time. They state that selecting the number of assimilations from 4 to 8, with proper inflation factors, produce good results for the final ensemble. They also declared that selecting the inflation factors geometrically in decreasing order is good to improve ES-MDA results. In the study of Silva et al. (2021), increasing the number of assimilation from 4 to 8 did not improve the ES-MDA outcomes significantly when using the method of Rafiee and Reynolds (2017). However, the first inflation factor computed by Rafiee and Reynolds (2017) is often large. It implies that when the number of assimilations is close to 4, the last inflation factor is close to 1, i.e., the last assimilation is almost full-step.

Emerick (2019) presented an ES-MDA algorithm that determines the inflation factor for the last iteration. Then, the algorithm computes the previous inflation factors geometrically in increasing order such that the sum of the inverse of them is equal to one. Computed value for the first inflation factor is tested against the Morozov Discrepancy Principle (MDP) (Morozov, 1984). If it is not satisfied, the method increases the number of assimilations in one and recomputes all the inflation factors again. However, recent studies (Le et al., 2016, Rafiee and Reynolds, 2017) observed that the ES-MDA update equation has a similar structure as a Levenberg–Marquardt algorithm. Therefore, the method of Hanke (1997) can be applied to select the inflation factors (Le et al., 2016). Moreover, as presented in Hanke (2010), the scheme for nonlinear inverse problems of Hanke (1997) is of optimal order, i.e., it provides optimal accuracy for such algorithms.

In this study, we present a novel method for generating the inflation factors for ES-MDA. We apply the analytical formula derived by Rafiee and Reynolds (2017) to determine a lower bound for the first inflation factor and use the same equation to propose the inflation factor’s computation for the last assimilation step using the singular values of the average sensitivity matrix estimated from the prior ensemble. In the calculation of the last inflation factor, we limit its magnitude to a previously selected threshold. Although we can mathematically prove such a procedure only for the linear-Gaussian case, the numerical examples presented in this study demonstrate that the proposed method is adequate for the nonlinear case. The other inflation factors are computed geometrically in decreasing order. The proposed algorithm then computes the correct number of data assimilations that produce inflation factors satisfying ES-MDA requirements. In addition, as the method of Emerick (2019) has no efficient procedure to compute the last inflation factor, the proposed analytical formula introduced by this study can be used in the algorithm of Emerick (2019).

The motivations for using the proposed method are the following: (i) if the number of assimilation runs is small (e.g., close to four), the last assimilation of the method of Rafiee and Reynolds (2017) will be an approximately full-step update; (ii) there is no efficient way to compute the last inflation factor in the method proposed by Emerick (2019); (iii) as noted by Iglesias (2015) and Rafiee and Reynolds (2017), the regularization parameter of Hanke (1997) usually decreases with the iterations; therefore, we may simulate this decreasing behavior in a geometric progression, computing the first and last inflation factor using the formula derived by Rafiee and Reynolds (2017) a priori; (iv) finally, the method of Hanke (1997) has been proven to be of optimal order for Levenberg–Marquardt algorithms (Hanke, 2010). As it has a similar form as the ES-MDA update equation (Le et al., 2016, Rafiee and Reynolds, 2017), we conjecture that it may produce better outcomes than the discrepancy principle for generating ES-MDA inflation factors.

The paper is organized into six parts. Sections 2 Regularization for nonlinear inverse problems, 4 Ensemble smoother with multiple data assimilation present theoretical background of nonlinear inverse problems and a brief formulation for the ES-MDA. Section 3 displays the formulation of the objective function used in history matching problems. Section 5 presents the methods of Rafiee and Reynolds (2017) and Emerick (2019). In Section 6, we describe the proposed method to generate the inflation factors for the ES-MDA and discuss its efficiency. Section 7 presents the results obtained by running the ES-MDA using the simplest implementation, the method of Rafiee and Reynolds (2017), the method of Emerick (2019), and the one proposed by this study in a synthetic two-dimensional waterflooding problem.

Section snippets

Regularization for nonlinear inverse problems

In this section, we present a brief theoretical background of regularization for nonlinear inverse problems, used to derive the studies of inflation factors generation for ES-MDA (Le et al., 2016, Rafiee and Reynolds, 2017, Emerick, 2019).

Consider a nonlinear problem of defining the vector xRNx given a vector of measurements yRNy. The relation between x and y is given by a nonlinear function. There is no guarantee of neither existence nor uniqueness of a solution to this problem. Thus, one

BayesIan formulation of history matching problem

We can define the probability density function (pdf) of the vector of model parameters m given a set of observations dobs through Bayesian statistics (Oliver et al., 2008): f(m|dobs)=f(dobs|m)f(m)f(dobs)=aL(m|dobs)f(m)where f(m) is the pdf of m; f(dobs) is the pdf of dobs; f(dobs|m) is the conditional pdf of dobs given m; L(m|dobs) is the likelihood function of m given dobs; and a is a normalizing constant. Assuming that both the model parameters and the model errors have Gaussian distribution,

Ensemble smoother with multiple data assimilation

In this section, we present the formulation of ES-MDA as proposed by Emerick and Reynolds (2013a). Also, the ES-MDA update equation is presented as a solution to the nonlinear inverse problem.

Previous works

In this section, we review the methods of Rafiee and Reynolds (2017) and Emerick (2019) for generating the inflation factors for the ES-MDA.

Proposed methodology

In this section, we present a new formulation of generating the inflation factors for ES-MDA. Rafiee and Reynolds (2017) used the scheme of Hanke (1997) to calculate the first inflation factor α1. However, the following ones are computed considering only Eq. (16). Emerick (2019) proposed a methodology to obtain these factors based on the selection of αNa. However, Emerick (2019) proposes no analytical procedure for efficiently computing αNa. Considering the remarks about the decay rate of the

Results and discussion

In this section, we present a comparison of the results obtained when implementing ES-MDA with data covariance inflation factors generated using the methods of Rafiee and Reynolds, 2017, Emerick, 2019, the simplest choice with αi=Na i=1,,Na, and the method proposed in this study. For simplification, we follow the nomenclature defined by Emerick (2019), where the method of Rafiee and Reynolds (2017) is referred to as ES-MDA-GEO1; the method of Emerick (2019) is referred to as ES-MDA-GEO2;

Summary and conclusions

In this study, we present a new procedure for generating inflations factors for the ES-MDA. The new method uses the analytical formula developed in the study of Rafiee and Reynolds (2017) to compute both the first and the last ES-MDA inflation factors. The other inflation factors are generated geometrically in decreasing order. As a result, the method computes the correct number of assimilations that produces inflation factors such that the sum of their inverse is equal to one, as required by

CRediT authorship contribution statement

Thiago M.D. Silva: Wrote the manuscript, Developed the method. Sinesio Pesco: Supervised this work, Contributed to the discussion, Helped in the written part. Abelardo Barreto Jr.: Supervised this work, Contributed to the discussion, Helped in the written part. Mustafa Onur: Supervised this work, Contributed to the discussion, Helped in the written part.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

Authors are grateful to Petrobras for partially sponsoring this research. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001. The authors are also grateful for TUPREP. The first author is grateful to Renan V. Bela for his valuable comments that improved this work. Some portions of this study were completed while the first author was a visiting Ph.D. student at the University of Tulsa from January 9th, 2020

References (29)

  • EmerickA.A. et al.

    Combining sensitivities and prior information for covariance localization in the ensemble Kalman filter for petroleum reservoir applications

    Comput. Geosci.

    (2011)
  • EmerickA.A. et al.

    Investigation of the sampling performance of ensemble-based methods with a simple reservoir model

    Comput. Geosci.

    (2013)
  • EvensenG.

    The ensemble Kalman filter: theoretical formulation and practical implementation

    Ocean Dyn.

    (2003)
  • HankeM.

    A regularizing Levenberg–Marquardt scheme, with applications to inverse groundwater filtration problems

    Inverse Problems

    (1997)
  • Cited by (9)

    View all citing articles on Scopus
    View full text