Elsevier

Computers & Geosciences

Volume 36, Issue 9, September 2010, Pages 1200-1209
Computers & Geosciences

CXTFIT/Excel–A modular adaptable code for parameter estimation, sensitivity analysis and uncertainty analysis for laboratory or field tracer experiments

https://doi.org/10.1016/j.cageo.2010.01.013Get rights and content

Abstract

We implemented the widely used CXTFIT code in Excel to provide flexibility and added sensitivity and uncertainty analysis functions to improve transport parameter estimation and to facilitate model discrimination for multi-tracer experiments on structured soils. Analytical solutions for one-dimensional equilibrium and nonequilibrium convection dispersion equations were coded as VBA functions so that they could be used as ordinary math functions in Excel for forward predictions. Macros with user-friendly interfaces were developed for optimization, sensitivity analysis, uncertainty analysis, error propagation, response surface calculation, and Monte Carlo analysis. As a result, any parameter with transformations (e.g., dimensionless, log-transformed, species-dependent reactions, etc.) could be estimated with uncertainty and sensitivity quantification for multiple tracer data at multiple locations and times. Prior information and observation errors could be incorporated into the weighted nonlinear least squares method with a penalty function. Users are able to change selected parameter values and view the results via embedded graphics, resulting in a flexible tool applicable to modeling transport processes and to teaching students about parameter estimation. The code was verified by comparing to a number of benchmarks with CXTFIT 2.0. It was applied to improve parameter estimation for four typical tracer experiment data sets in the literature using multi-model evaluation and comparison. Additional examples were included to illustrate the flexibilities and advantages of CXTFIT/Excel. The VBA macros were designed for general purpose and could be used for any parameter estimation/model calibration when the forward solution is implemented in Excel. A step-by-step tutorial, example Excel files and the code are provided as supplemental material.

Introduction

CXTFIT is widely used for tracer experiment data analysis to investigate transport of solutes in the subsurface. The original FORTRAN version (Parker and van Genuchten, 1984b) reads ASCII data files with observations, initial parameter guesses, and control variables. Parameters for equilibrium or nonequilibrium convection dispersion models (CDE or MIM) are estimated using a nonlinear least squares method. Optimized values, standard errors, and correlations for parameters and predicted concentrations are output for the fitted model. Toride et al. (1995) added solutions for first order decay and zero order production for MIM, nonuniform initial conditions for CDE and MIM and developed a user-friendly interface. Nützmann et al. (2005) implemented Visual CXTFIT in Microsoft Excel with similar input/output structure and functions.

The above versions of CXTFIT have a number of limitations. First, the parameters that can be estimated are hard-wired and are not always convenient to the experiment or study objectives. For example, average pore velocity v, dispersion coefficient D, retardation factor R, and pulse volume T0, are invertible parameters in existing CXTFIT. However, these parameters are not independent, e.g., v=q/θ, D=λv, R=1+ρbkd/θ and To=qto/θL where q is Darcy velocity, θ is effective porosity, λ is dispersivity, ρb is bulk density, kd is partition coefficient, and to is pulse duration. All of the invertible parameters are dependent on θ and model predictions are often very sensitive to θ (e.g., Tang et al., 2009). If θ is not accurately known, or is suspected to be different from the measured total porosity, inversion results may be inaccurate. Because sensitivity is essential for the identification of transport mechanisms and the estimation of parameters, and it is useful for experimental design and the diagnosis of misfit and overfit (Tang et al., 2009), the option to output sensitivity is necessary.

Second, CXTFIT was designed to estimate one set of parameters for one solute at a time. Take Fig. 7.9 in Toride et al. (1995) for example, D and mobile water fraction θm were first estimated from 3H2O data to be D=15.57 cm2 d−1 and θm/θ=0.823 with a 95% confidence interval of 7.129–24.01 cm2 d−1 and 0.760–0.886, respectively. The D and θm/θ estimates were assumed to be identical for 3H2O and boron (B) and were subsequently used to estimate the mass transfer coefficient ω and fraction of equilibrium sorption sites (f) of B. While this logic is commonly used and may simplify parameter estimation for B, the impact of the uncertainty in D and θm on the estimate of ω and f for B is ignored. Fitting data for both 3H2O and B simultaneously using the same assumptions (identical D and θm for 3H2O and B) might be preferable because the uncertainties in all the parameters could be quantified (Tang et al., 2009). Such a strategy requires more flexibility in setting up the parameter estimation problem than what is currently possible in CXTFIT.

Third, CXTFIT uses an ordinary nonlinear least squares method based on the assumption of normally distributed error with identical variance for all observations. In practice, measurement errors may differ for different concentration levels or for different chemicals in the same experiment. Parameters in CXTFIT are assumed to be either known or unknown and hence they are either fixed or estimated (with or without constraints) in the inverse solution. A weighted least squares formulation with a penalty function can incorporate observation uncertainties and prior parameter information to improve parameter estimation. Data and/or parameter transformations can also make the optimization faster and more stable (Doherty, 2004). This flexibility is useful when many parameters are estimated and/or when uncertainty in a few parameters is significant.

Finally, the nonlinear squares method in CXTFIT is solved by a gradient method for efficiency and the estimated parameter confidence intervals are based on a linear approximation assumption. When many parameters are estimated, the results can be dependent on the initial guess and the optimization can be trapped in a local minimum, resulting in inaccurate confidence intervals and sensitivities. In these cases, it is desirable to check the response surface to examine the correlation of two or three highly correlated parameters (Hill and Tiedeman, 2007) and observe the shape and range of the confidence intervals of parameters (Press et al., 1992). In the case of many parameters, Monte Carlo analysis can also help determine if the optimized value is a global minimum and to analyze global sensitivity and uncertainty of parameters with respect to model fits.

The objective of this work is to implement a version of CXTFIT in Excel to achieve greater flexibility in setting up parameter estimation problems and to provide additional tools for sensitivity analysis, uncertainty analysis, model evaluation, and model comparison. Unlike Visual CXTFIT by Nützmann et al. (2005) and CXTFIT by Parker and van Genuchten (1984b) and Toride et al. (1995), we implement the direct solutions as math functions so that users can easily set up direct and inverse problems according to individual needs and preferences. We add functions with user-friendly interfaces for sensitivity analysis, error propagation, response surface calculation, and Monte Carlo analysis and implement the weighted least squares method with a penalty function to incorporate observation error and parameter prior information in the inversion. Therefore, CXTFIT/Excel has far greater flexibility and more uncertainty analysis functions than previous versions. This was shown to be important for improving data interpretation for increasingly complex column experiments (Tang et al., 2009). While our previous work focused on using multi-model evaluation and comparison to improve column experimental data interpretation, this work focuses on implementation and application of CXTFIT/Excel as a tool. Additional examples from the literature are analyzed in this work to illustrate the potential advantages of CXTFIT/Excel.

The code is implemented in Microsoft Excel 2003 under Microsoft Windows XP Professional and verified by a number of benchmarks in CXTFIT 2.0 (Toride et al., 1995). The source code is available together with a step-by-step tutorial to set up and use CXTFIT/Excel from scratch (“Tutorial”). Excel files for these benchmarks, other examples, and blank Microsoft Excel files for both Excel 2003 and 2007 are also provided. Issues related to the two versions of Excel are also addressed in the Tutorial.

Section snippets

Theory

The CDE model may be written as (Parker and van Genuchten, 1984b)Rct=D2cx2vcxμc+γsubject to specified initial and boundary conditions where c is the resident concentration in the pore water [ML−3], μ is a first order decay coefficient or irreversible adsorption coefficient, [T−1], γ is a production rate [ML−3T−1], x is the distance from the inlet, and t is the time. Analytical solutions are given by Parker and van Genuchten (1984b).

The dimensionless MIM is (Toride et al., 1995)βRCmT=1P

Model setup

As described by Wraith and Or (1998) and illustrated in Fig. 1, setting up a parameter estimation problem in a spreadsheet involves input of parameters, observations, and formulae. The parameters may either be fixed (Step 1) or estimated with or without constraints (Step 2). The observations are at various times and/or locations (Step 3). Formulae are required for direct model predictions, residuals (Step 4), objective function (Step 5), and/or parameter confidence intervals (Step 6). Of note

Applications

CXTFIT/Excel was used to solve a number of benchmark problems in CXTFIT 2.0 (Toride et al., 1995) to verify the code and to provide examples to facilitate transition for CXTFIT to CXTFIT/Excel (see Tutorial). The results are very close for both forward and inverse solutions. CXTFIT/Excel was also applied to analyze four column experiment data sets for multiple model choices and parameterizations in Tang et al. (2009) to illustrate how multi-model evaluation and comparison can help in data

Summary

The analytical solutions for the one-dimensional equilibrium and nonequilibrium convection dispersion equations (CDE and MIM) were coded as VBA functions, which may be used as ordinary math functions for the forward solutions in Excel. Objective function minimization, sensitivity analysis, uncertainty analysis, error propagation, response surface calculation, and Monte Carlo analysis functions were programmed into VBA subroutines, which can be used as macros. User-friendly interfaces were

Acknowledgements

We appreciate the review comments provided by Dr. Philip Meyer and another anonymous reviewer, which significantly helped improve this paper. This research was funded by the U.S. Department of Energy, Office of Science, Office of the Biological and Environmental Research. Oak Ridge National Laboratory is managed by the University of Tennessee-Battelle, LLC, under contract DE-AC05-00OR22725 with the US DOE.

References (14)

There are more references available in the full text version of this article.

Cited by (60)

  • Experiments and simulation of co-migration of copper-resistant microorganisms and copper ions in saturated porous media

    2021, Journal of Contaminant Hydrology
    Citation Excerpt :

    Three steps were adopted to verify the reliability of model simulation results. CXTFIT software was used to obtain the pore water flow velocity V(=krw) and the dispersion coefficient D based on the tracer ion breakthrough curve (Tang et al., 2010; Toride, 1995). The adsorption analytical model and the internal optimization process of HYDRUS-1D were used to analyze the microbial breakthrough curve to estimate the characteristics of water flow and microbial migration (Simunek, 2005).

View all citing articles on Scopus

Code available from server at http://www.iamg.org/CGEditor/index.htm.

View full text