Streamflow rating uncertainty: Characterisation and impacts on model calibration and performance

https://doi.org/10.1016/j.envsoft.2014.09.011

Highlights

  • Residual heteroscedasticity for many streamflow gauging stations in Australia.

  • Censoring extrapolated flows reduced model performance in regional calibration.

  • Calibrating parameters against uncensored data fared better in prediction mode.

  • Weighting on uncertain flows using a modified NSE improved calibration/prediction.

  • Weighting on uncertain flows also reduced parameter uncertainty.

Abstract

Common streamflow gauging procedures require assumptions about the stage-discharge relationship (the ‘rating curve’) that can introduce considerable uncertainties in streamflow records. These rating uncertainties are not usually considered fully in hydrological model calibration and evaluation yet can have potentially important impacts. We analysed streamflow gauge data and conducted two modelling experiments to assess rating uncertainty in operational rating curves, its impacts on modelling and possible ways to reduce those impacts. We found clear evidence of variance heterogeneity (heteroscedasticity) in streamflow estimates, with higher residual values at higher stage values. In addition, we confirmed the occurrence of streamflow extrapolation beyond the highest or lowest stage measurement in many operational rating curves, even when these were previously flagged as not extrapolated. The first experiment investigated the impact on regional calibration/evaluation of: (i) using two streamflow data transformations (logarithmic and square-root), compared to using non-transformed streamflow data, in an attempt to reduce heteroscedasticity; and (ii) censoring the extrapolated flows, compared to no censoring. Results of calibration/evaluation showed that using square-root transformed streamflow (which balances the weight placed on high and low streamflow) performed better than using non-transformed and log-transformed streamflow. Also, surprisingly, censoring extrapolated streamflow reduced rather than improved model performance. The second experiment investigated the impact of rating curve uncertainty on catchment calibration/evaluation and parameter estimation. A Monte-Carlo approach and the nonparametric Weighted Nadaraya-Watson (WNW) estimator were used to derive streamflow uncertainty bounds. These were later used in calibration/evaluation using a standard Nash-Sutcliffe Efficiency (NSE) objective function (OBJ) and a modified NSE OBJ that penalised uncertain flows. Using square-root transformed flows and the modified NSE OBJ considerably improved calibration and predictions, particularly for mid and low flows, and there was an overall reduction in parameter uncertainty.

Introduction

Streamflow data are generally estimated from stage measurements through a stage–discharge relationship (the ‘rating curve’). The curve is developed by measuring flow with manual methods (estimation of flow velocity combined with estimates of river width and height for subsections of the river) and relating it to measured flow height at various points in time; regression techniques are then used to interpolate/extrapolate that relationship across all height-flow levels. Several sources of uncertainty can be accounted for in this procedure, including measurements of flow height, width and shape of the river cross-section and inaccuracies in the measurement of the velocity–area relationship (Domeneghetti et al., 2012). Another source of uncertainty arises from the regression techniques used to derive the stage–discharge relationship. The classical approach for deriving the stage-discharge (rating) relationship involves fitting a curve to (log-transformed) discrete rating measurements using (non)linear least squares. This implicitly assumes that the measurement residuals have a normal distribution and are unrelated to the expected discharge (Petersen-Øverleir, 2004). Residuals for existing curves often show non-normal distributions (e.g. Tomkins and Davidson, 2011) with higher residual values at higher stage values (heteroscedasticity). Scarce sampling and heteroscedasticity observed in streamflow residuals may introduce large uncertainty in streamflow estimates based on extrapolation of the rating curve (Westerberg et al., 2011). These streamflow observations are the core data used to calibrate hydrological models.
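As a minimal sketch of the classical approach described above, the following Python fragment fits a power-law rating curve Q = a(h − h0)^b by ordinary least squares in log space and flags stages outside the gauged range as extrapolated. The power-law form, the offset `h0`, the function names and the gauging values are illustrative assumptions, not the operational procedure used for the stations studied here.

```python
import math

def fit_log_rating(stages, flows, h0=0.0):
    """Fit log(Q) = log(a) + b * log(h - h0) by ordinary least squares.

    h0 is an assumed, known offset; real curves are often piecewise.
    """
    xs = [math.log(h - h0) for h in stages]
    ys = [math.log(q) for q in flows]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_mean - b * x_mean)
    return a, b

def rating_residuals(stages, flows, a, b, h0=0.0):
    """Residuals (gauged minus rated flow) in flow units."""
    return [q - a * (h - h0) ** b for h, q in zip(stages, flows)]

def is_extrapolated(stage, gauged_stages):
    """True when a stage lies outside the range covered by gaugings."""
    return stage < min(gauged_stages) or stage > max(gauged_stages)

# Hypothetical gaugings that follow Q = 2 * h^1.5 exactly.
stages = [0.5, 1.0, 2.0, 4.0]
flows = [2.0 * h ** 1.5 for h in stages]
a, b = fit_log_rating(stages, flows)
residuals = rating_residuals(stages, flows, a, b)
```

Plotting such residuals against stage is one way to look for the widening spread at high stage that indicates heteroscedasticity.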

Objective functions (OBJs) are used in calibration to minimise the differences between observed and modelled streamflow and also to assess the model performance under prediction. Traditionally, the minimisation is performed against the sum-of-squared residuals under the assumption that these residuals are homoscedastic in nature (i.e. there is no variance heterogeneity in the streamflow data). This assumption is often not valid for streamflow data (Petersen-Øverleir, 2004) and its violation may overestimate goodness-of-fit metrics used in simulations (McMillan et al., 2010). Moreover, routinely used OBJs in calibration, for example the Nash-Sutcliffe Efficiency (NSE, Nash and Sutcliffe, 1970), place high weights on high flows which may be extrapolated, thus potentially biasing predictions (Croke, 2007).
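The effect of a transformation on the weight given to high flows can be illustrated with a small sketch of the NSE computed on non-transformed, square-root and log-transformed flows, with optional censoring of flagged values. The function names, the small constant `eps` added before taking logarithms and the flow values are assumptions made for illustration only.

```python
import math

def transform(q, kind="none", eps=1e-6):
    """Apply a streamflow transformation; eps (assumed) avoids log(0)."""
    if kind == "none":
        return q
    if kind == "sqrt":
        return math.sqrt(q)
    if kind == "log":
        return math.log(q + eps)
    raise ValueError("unknown transformation: " + kind)

def nse(obs, sim, kind="none", eps=1e-6, censored=None):
    """Nash-Sutcliffe Efficiency on (optionally) transformed flows.

    censored is an optional list of booleans; True drops that time step.
    """
    keep = [i for i in range(len(obs)) if censored is None or not censored[i]]
    obs_t = [transform(obs[i], kind, eps) for i in keep]
    sim_t = [transform(sim[i], kind, eps) for i in keep]
    mean_obs = sum(obs_t) / len(obs_t)
    sse = sum((o - s) ** 2 for o, s in zip(obs_t, sim_t))
    sst = sum((o - mean_obs) ** 2 for o in obs_t)
    return 1.0 - sse / sst

# Hypothetical series in which only the largest flow is mis-simulated.
obs = [1.0, 4.0, 9.0, 100.0]
sim = [1.0, 4.0, 9.0, 81.0]
nse_raw = nse(obs, sim)
nse_sqrt = nse(obs, sim, kind="sqrt")
nse_censored = nse(obs, sim, censored=[False, False, False, True])
```

In this example the error on the largest flow lowers the non-transformed NSE more than the square-root NSE, and censoring that time step removes its influence entirely.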

In this paper, we investigate the impact of streamflow rating uncertainty on hydrological model calibration and performance (i.e. ‘prediction’ using streamflow data from catchments not used for model calibration or split-sample ‘evaluation’ using streamflow data from a period not used for model calibration). Firstly, we use a comprehensive hydrometric dataset of 65 streamflow gauges (described in Section 2) to assess the occurrence of heteroscedasticity and extrapolation in rating curves (Section 2.1). Secondly, we conduct two types of experiments:

  • (i)

    The first experiment makes use of the entire streamflow dataset (65 streamflow time-series) to assess the impacts of including uncertain extrapolated streamflow data in a regional calibration/prediction experiment (Section 2.2). Several methods were trialled to address this problem; from censoring all extrapolated high flows to using streamflow Box-Cox transformation (Box and Cox, 1964, Bennett et al., 2013) in an attempt to reduce heteroscedasticity. For this experiment, we calibrated a single parameter set (n = 28) of the process-based landscape water balance model Australian Water Balance Assessment system Landscape model (AWRA-L) (van Dijk, 2010, van Dijk and Renzullo, 2011, Vaze et al., 2013) in 33 of the 65 stations and performance was independently evaluated for the remaining 32 stations. We assessed the impact of censoring high flows on the NSE compared to no censoring. We repeated the experiment using two streamflow transformations (logarithmic and square-root). The regional calibration (i.e. a single set of parameters to predict flows in a large geographical domain) was chosen for methodological and practical reasons. Firstly, predictions of streamflow and other fluxes and stores (e.g. evapotranspiration and soil water) are required in many ungauged basins with dissimilar climate and biophysical characteristics; a regional calibration using a large amount of catchment streamflow data might yield better results than parameter regionalisation techniques (Parajka et al., 2007, Vaze and Teng, 2011). Secondly, AWRA-L has been regionally calibrated (using a similar approach as in this paper) against Australian streamflow and evapotranspiration data; producing results that markedly improved compared to a previous non-calibrated version (version 1.0 vs. 0.5; Viney et al., 2011). 
The calibration results were also similar to results from locally calibrated conceptual models, showing that AWRA-L can capture the different climatic and biophysical characteristics that affect streamflow (Viney et al., 2011). Thirdly and finally, AWRA-L is currently used operationally to provide information on water fluxes and stores across Australia and is being continuously refined.

  • (ii)

    The second experiment investigates the impact of rating curve uncertainty on the NSE and parameter estimation in a local calibration/evaluation (Section 2.3) using a Monte-Carlo approach and the nonparametric Weighted Nadaraya-Watson (WNW) estimator. We use these methods for quantifying the error in the rating curve because they capture changes in the rating curve with time, are nonparametric and make minimal assumptions about the probabilistic distribution of the data. We employed them to derive rating curve uncertainty bounds for 100 streamflow realisations. To interpret impacts on parameter space, we calibrated the 4 parameters of the conceptual rainfall-runoff model GR4J (Perrin et al., 2003), which is simpler than the 28-parameter AWRA-L. The uncertainty bounds were later used in split-sample calibration/evaluation in a single station using a standard NSE OBJ and a modified NSE OBJ, which used the bounds to penalise uncertain flows. Again, we repeated the experiment using logarithmic and square-root streamflow transformations.
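The idea of penalising uncertain flows in the second experiment can be sketched as an NSE in which each time step is weighted by the width of its uncertainty bounds. This excerpt does not give the modified NSE formula, so the weighting below (inverse of one plus the relative bound width), the use of the ensemble minimum and maximum as bounds, and all names and values are hypothetical choices, not the method of this paper.

```python
def bounds_from_realisations(realisations):
    """Per-time-step lower/upper bounds across an ensemble of flow series.

    Uses the ensemble min and max (an assumption; quantiles could be used).
    """
    steps = list(zip(*realisations))
    lower = [min(step) for step in steps]
    upper = [max(step) for step in steps]
    return lower, upper

def uncertainty_weights(obs, lower, upper):
    """Hypothetical weights in (0, 1]: wider relative bounds, smaller weight."""
    weights = []
    for o, lo, up in zip(obs, lower, upper):
        rel_width = (up - lo) / o if o > 0 else (up - lo)
        weights.append(1.0 / (1.0 + rel_width))
    return weights

def weighted_nse(obs, sim, weights):
    """NSE with per-time-step weights; equal weights give the standard NSE."""
    w_sum = sum(weights)
    mean_obs = sum(w * o for w, o in zip(weights, obs)) / w_sum
    sse = sum(w * (o - s) ** 2 for w, o, s in zip(weights, obs, sim))
    sst = sum(w * (o - mean_obs) ** 2 for w, o in zip(weights, obs))
    return 1.0 - sse / sst

# Hypothetical example: two realisations that disagree only at the peak.
realisations = [[1.0, 2.0, 3.0, 8.0], [1.0, 2.0, 3.0, 12.0]]
obs = [1.0, 2.0, 3.0, 10.0]
sim = [1.0, 2.0, 3.0, 9.0]
lower, upper = bounds_from_realisations(realisations)
weights = uncertainty_weights(obs, lower, upper)
nse_standard = weighted_nse(obs, sim, [1.0] * len(obs))
nse_weighted = weighted_nse(obs, sim, weights)
```

With these values the peak flow receives a smaller weight than the well-constrained flows, so its simulation error counts for less in the weighted score than in the standard NSE.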

The data and methods are described in Section 2. The results of the experiments are presented and analysed in Section 3, the findings are discussed in Section 4 and finally conclusions are drawn (Section 5).

Data and methods

The New South Wales (NSW) Office of Water (NoW) in Australia regularly republishes the ‘Pinneena’ water database on DVD (http://waterinfo.nsw.gov.au/pinneena/gw.shtml). The version used here (December, 2009) includes 127,000 years of daily streamflow information from 1400 stations. The database includes records of hydraulic control type (including concrete structures, rocky river bed not reinforced with concrete, gravel or sand river bed), stage height, rating tables, interpolation method,

Heteroscedasticity and extrapolation in rating curves

Plots of stage vs. maximum-normalized residuals between rating measurements and the rating curve were visually examined for evidence of heteroscedasticity. The classification resulted in 29 stations (45%) showing the characteristic trumpet shape; these were classified as type A (moderate to strong evidence of heteroscedasticity). Another 14 (21%) were classified as type B (none to slight evidence of heteroscedasticity), and finally 22 (34%) stations were classified as type C (inconclusive

Heteroscedasticity and extrapolation in rating curves

This paper ascertained the occurrence of residual heteroscedasticity in 65 streamflow stations located in southeast Australia. For the stations studied here, rating curves are derived using piecewise regressions of a logarithmic or linear form. Upon visual inspection, residuals between measured streamflow and values estimated with these curves showed evidence of heteroscedasticity in 29 stations, with higher residual values at higher stage values. Results were inconclusive in 22 stations, mainly

Conclusion

Analysis of stage-discharge data and operational rating curves for 65 unregulated catchments in New South Wales, Australia, revealed the incidence of higher residual values at higher stage values (heteroscedasticity) in the stage–discharge relationship (the ‘rating curve’). In addition, streamflow quality codes were not always reliable, with an important number of extrapolated flows (both in the high and low flow range) remaining in the streamflow data although they were deemed of ‘good’ quality (i.e. not

Acknowledgements

This work is part of the water information research and development alliance between the Bureau of Meteorology and CSIRO's Water for a Healthy Country Flagship. Kerry Tomkins, Q. J. Wang, Ming Li and David Robertson from CSIRO Land and Water Flagship, Warren Jin from CSIRO Mathematics, Informatics and Statistics (CMIS) and Sri Srikanthan from the Bureau of Meteorology are also thanked for providing valuable comments on the research. We would also like to acknowledge the valuable comments of

References (43)

  • B.F.W. Croke. Representing uncertainty in objective functions: extension to include the influence of serial correlation.

  • G. Di Baldassarre et al. Is the current flood of data enough? A treatise on research needs for the improvement of flood modelling. Hydrol. Process. (2012)

  • A. Domeneghetti et al. Assessing rating-curve uncertainty and its effects on hydraulic model calibration. Hydrol. Earth Syst. Sci. Discuss. (2012)

  • P. Hall et al. Methods for estimating a conditional distribution function. J. Am. Stat. Assoc. (1999)

  • R.D. Harmel et al. Consideration of measurement uncertainty in the evaluation of goodness-of-fit in hydrologic and water quality modeling. J. Hydrol. (2007)

  • T. Hayfield et al. Nonparametric Econometrics: the np Package. J. Stat. Softw. (2008)

  • B. Hrafnkelsson et al. Modeling discharge rating curves with Bayesian B-splines. Stoch. Environ. Res. Risk Assess. (2012)

  • Joint Flood Taskforce. Joint Flood Taskforce Report. Queensland Flood Commission of Inquiry (2011)

  • V. Klemes. Operational testing of hydrological simulation-models. Hydrol. Sci. J.-J. Des Sci. Hydrol. (1986)

  • N. Le Moine et al. How can rainfall-runoff models handle intercatchment groundwater flows? Theoretical study based on 1040 French catchments. Water Resour. Res. (2007)

  • J. Lerat. Do internal flow measurements improve the calibration of rainfall-runoff models? Water Resour. Res. (2012)