Use of neurofuzzy networks to improve wastewater flow-rate forecasting

doi:10.1016/j.envsoft.2008.10.010

Environmental Modelling & Software

Volume 24, Issue 6, June 2009, Pages 686-693

https://doi.org/10.1016/j.envsoft.2008.10.010

Abstract

A neurofuzzy wastewater flow-rate forecasting model (NFWFFM) has been developed and tested with actual data measured at the input of two wastewater treatment facilities, which treat the wastewater corresponding to 150,000 and 1,250,000 p.e., respectively. Good agreement between forecasted and actual flow-rates was obtained. The artificial intelligence algorithm uses only two input variables (day of the week and average daily flow-rate of the previous day) and one output variable (predicted average daily flow-rate). Using three months of data to train the network, a long-term forecast (one month) is made with average errors below 10%. Results were compared with those obtained by applying the Census Method II (a commonly used decomposition/recomposition time series method); the forecast made by the NFWFFM was more accurate than the one made by this statistical method.

Introduction

In recent years, a large number of control algorithms have been developed for controlling the biological processes of wastewater treatment facilities (WWTFs). Probably the most significant one is the predictive model-based controller (PMBC). This algorithm uses computer simulation to predict the behaviour of a facility and then, with this information, modifies the inputs of the process to optimise the operational results and/or costs. To use a predictive model-based controller properly in a WWTF, a forecasting tool that lets the computer know future flow-rates and pollutant loads is needed. The most frequently used forecasting methods are decomposition/recomposition time series models (BLS, Census Methods, etc.), autoregressive moving average (ARMA) models and autoregressive integrated moving average (ARIMA) models (Koutroumanidis et al., 2006). In addition, new models based on artificial intelligence are used as predictive models, especially neural networks (NNs). These techniques have been compared with the traditional ones, showing significant improvements (White, 1988, Wong et al., 1992, Weigend et al., 1992, Tamada et al., 1993, Onkal-Engin et al., 2005, Raduly et al., 2007, Al-Alawi et al., 2008), which makes their application in the PMBC very interesting. Like NNs, fuzzy logic (another branch of artificial intelligence) has important characteristics that make it suitable for different uses (Yong et al., 2006, Barreto-Neto and de Souza Filho, 2008). Moreover, neural networks and fuzzy logic share the ability to deal with difficulties arising from uncertainty, imprecision, and noise in a natural environment. In this context, the learning capability of NNs, and as a consequence their potential use as universal approximators (Hornik et al., 1989, Poggio and Girosi, 1990), is the main factor in their selection as forecasting methods. Fuzzy systems are also used as approximators, since they can approximate any continuous function on a compact set (Kosko, 1992).

To obtain the benefits of both neural networks and fuzzy logic systems while avoiding their respective problems, many authors have proposed combining them into an integrated system, so that the low-level learning and computational power of neural networks is brought into fuzzy logic systems, and the high-level, human-like thinking and reasoning of fuzzy logic systems is brought into neural networks. These systems are called fuzzy neural networks (FNN) and they have been widely used in many different applications during the last decades (Enbutsu et al., 1993, Tanaka et al., 1995, Hiraga et al., 1995, Gobi and Pedrycz, 2006, Luo et al., 2007, Modi et al., 2008).

The main objective of this work has been to use an FNN to model the influent flow-rates in different WWTFs. The input data consisted of the influent flow-rate to the WWTF during the previous day and the day of the week. The single output is the influent flow-rate forecast. Daily data from 3 months were used for training the network (parameter estimation), whereas a different set of flow-rate data (covering one month) was used to test the generalisation ability of the FNN at various stages of learning (model validation). The Performance Index (P.I.) as defined by Lin and Cunningham (1995) was used to evaluate the generalisation ability of the FNN, and the cross-validation criterion (Amari et al., 1997) was used to determine the best moment to stop the learning process.
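As an illustration of the stopping rule described above, the following Python sketch halts training once the validation error has stopped improving. It is not the authors' code: the function names, the `patience` parameter and the toy error curve are assumptions made for the example, and the validation error is a generic number rather than the Performance Index of Lin and Cunningham (1995).

```python
def train_with_early_stopping(train_step, validation_error,
                              max_epochs=200, patience=10):
    """Run train_step() once per epoch and stop when the validation
    error has not improved for `patience` consecutive epochs."""
    best_error = float("inf")
    best_epoch = -1
    for epoch in range(max_epochs):
        train_step()
        error = validation_error()
        if error < best_error:
            # New best generalisation so far: remember where it happened.
            best_error, best_epoch = error, epoch
        elif epoch - best_epoch >= patience:
            # Validation error keeps getting worse: stop learning.
            break
    return best_epoch, best_error


# Toy stand-in for a real network (assumption): the validation error
# falls until epoch 20 and rises afterwards, as in overtraining.
toy_curve = [(e - 20) ** 2 / 400 + 0.05 for e in range(200)]
state = {"epoch": 0}


def toy_train_step():
    state["epoch"] += 1


def toy_validation_error():
    return toy_curve[state["epoch"] - 1]


best_epoch, best_error = train_with_early_stopping(toy_train_step,
                                                   toy_validation_error)
print(best_epoch, best_error)
```

In a real training loop the network parameters would also be saved at `best_epoch`, so that the model used for forecasting is the one with the lowest validation error rather than the last one trained.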

In order to increase the credibility and impact of the model presented here, the 10 iterative steps for good, disciplined development of models (Jakeman et al., 2006) were taken into account. Thus, the objective proposed for the model was not only to reproduce the observed flow-rates but also to be a first stage in the development of a procedure to obtain rules with a physical meaning from raw flow-rate data. These rules would subsequently be used in a PMBC in order to enhance the WWTF control. Because the final users are WWTF personnel, one of the constraints on the model is that it should not be too complex. With this in mind, the topology of the network corresponds to the three stages in the development of a fuzzy system: fuzzification, rules and defuzzification. This topology was selected because it allows the neural network to be treated not as a black-box model but as a model that can obtain rules with a physical meaning from raw flow-rate data. The parameters were estimated by applying two learning algorithms, a self-organised one and a back-propagation one, and the training and validation of the model were carried out using experimental observations from the WWTFs. In addition, it is important to bear in mind that the model proposed is not the final step of our research line but just a stage in a more ambitious program that aims to:

• Develop a procedure that obtains meaningful rules (expert system) from flow-rate data without the direct cooperation of human thinking (primary fundamental objective). In a next step (future work) this will simplify the design of expert systems for the control of WWTPs.
• Develop a procedure that can be implemented in a supervisory control scheme, particularly inside a model predictive controller, for optimising the performance of WWTFs (primary practical objective).

For this reason, and in order to avoid any problem due to software connectivity, in-house code was developed and used instead of a commercial software package.
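The three stages named above (fuzzification, rules and defuzzification) can be sketched in a few lines of Python. This is only a minimal illustration and not the network described in the paper: the Gaussian membership functions, the crisp day sets, the product rule strength, the weighted-average defuzzification and every numeric value are assumptions chosen for the example.

```python
import math

# Hypothetical fuzzy sets for the previous-day flow-rate: (centre, width).
FLOW_SETS = {"low": (80.0, 15.0), "medium": (100.0, 15.0), "high": (120.0, 15.0)}
# Hypothetical crisp sets for the day of the week (0 = Monday).
DAY_SETS = {"working day": {0, 1, 2, 3, 4}, "weekend": {5, 6}}
# Hypothetical rule base: (day set, flow set) -> forecast flow-rate.
RULES = {
    ("working day", "low"): 90.0,
    ("working day", "medium"): 105.0,
    ("working day", "high"): 120.0,
    ("weekend", "low"): 75.0,
    ("weekend", "medium"): 90.0,
    ("weekend", "high"): 105.0,
}


def gaussian(x, centre, width):
    """Gaussian membership degree of x in a fuzzy set."""
    return math.exp(-((x - centre) / width) ** 2)


def forecast(day_of_week, previous_flow):
    # 1. Fuzzification: membership degree of each input in each set.
    flow_mu = {name: gaussian(previous_flow, c, w)
               for name, (c, w) in FLOW_SETS.items()}
    day_mu = {name: 1.0 if day_of_week in days else 0.0
              for name, days in DAY_SETS.items()}
    # 2. Rules: firing strength is the product of the two memberships.
    strength = {rule: day_mu[rule[0]] * flow_mu[rule[1]] for rule in RULES}
    # 3. Defuzzification: weighted average of the rule outputs.
    total = sum(strength.values())
    if total == 0.0:
        raise ValueError("no rule fires for this input")
    return sum(strength[rule] * RULES[rule] for rule in RULES) / total


# Each rule can be read back in plain language.
for (day, flow), value in RULES.items():
    print(f"IF day is {day} AND previous flow is {flow} THEN forecast is {value}")
print(forecast(0, 100.0))
```

Because every rule is an explicit IF/THEN statement, a structure of this kind can be inspected by plant personnel, which is the sense in which such a model is not a black box.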

Section snippets

Formulation of the neurofuzzy wastewater flow-rate forecasting tool

The objective of this neural network is not only to model flow-rates (to be used in a PMBC) but also to use the topology of the network to obtain rules with a physical meaning. This work is a first step, and the topology of the network has been specially designed to accomplish this objective. This explains the topology of the FNN used and some of the assumptions made in the model formulation (number of layers, neurons, etc.). In order to define the neural

Training and validation of the wastewater flow-rate forecasting FNN

One important point is that the amount of data used for training and validating the FNN should be small: the FNN is developed for controlling processes, and long-term changes should not be fed to the model because they can lead to wrong control actions. Taking this into account, a period of four months was considered adequate for training and validation of the model (3 months for training and 1 month for validation).
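The data layout described above can be sketched as follows. The sketch pairs each day with its two inputs (day of the week and previous-day flow-rate) and splits the samples chronologically. The function names, the start date, the synthetic flow values and the choice of 90 and 30 days to stand for three months and one month are all assumptions made for illustration.

```python
from datetime import date, timedelta


def make_samples(first_day, daily_flows):
    """Build ((day_of_week, previous_day_flow), flow) pairs.

    The first day has no previous-day flow, so it only serves as an input."""
    samples = []
    for i in range(1, len(daily_flows)):
        day = first_day + timedelta(days=i)
        inputs = (day.weekday(), daily_flows[i - 1])  # 0 = Monday
        samples.append((inputs, daily_flows[i]))
    return samples


def chronological_split(samples, n_train):
    """Keep the time order: earlier samples train, later ones validate."""
    return samples[:n_train], samples[n_train:]


# Synthetic data (assumption): higher flow on working days than at weekends.
first_day = date(2008, 1, 1)  # hypothetical start date
daily_flows = [
    110.0 if (first_day + timedelta(days=i)).weekday() < 5 else 95.0
    for i in range(121)
]

samples = make_samples(first_day, daily_flows)
training, validation = chronological_split(samples, n_train=90)
print(len(training), len(validation))
```

A chronological split is used here instead of a random one so that the validation month follows the training period, as a forecast would in operation.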

Conclusions

From this work the following conclusions can be drawn.

• Fuzzy neural networks are a suitable method for forecasting urban wastewater flow-rates.
• Using only two input variables and a small number of neurons, average errors below 10% (expressed as P.I.) can be obtained in the forecasting of urban wastewater flow-rates. Maximum errors are always under 22% (expressed as relative error).
• The artificial intelligence model cannot predict flow-rates under or over the values contained in the training data set,

References (25)

S. Al-Alawi et al.
Combining principal component regression and artificial neural networks for more accurate predictions of ground-level ozone
Environmental Modelling & Software
(2008)
A. Barreto-Neto et al.
Application of fuzzy logic to the evaluation of runoff in a tropical watershed
Environmental Modelling & Software
(2008)
A.F. Gobi et al.
The potential of fuzzy neural networks in the realization of approximate reasoning engines
Fuzzy Sets and Systems
(2006)
K. Hornik et al.
Multilayer feedforward networks are universal approximators
Neural Networks
(1989)
A.J. Jakeman et al.
Ten iterative steps in development and evaluation of environmental models
Environmental Modelling & Software
(2006)
T. Koutroumanidis et al.
Time-series modeling of fishery landings using ARIMA models and fuzzy expected intervals software
Environmental Modelling & Software
(2006)
X. Luo et al.
A fuzzy neural network model for predicting clothing thermal comfort
Computers and Mathematics with Applications
(2007)
P.K. Modi et al.
Fuzzy neural network based voltage stability evaluation of power systems with SVC
Applied Soft Computing Journal
(2008)
G. Onkal-Engin et al.
Determination of the relationship between sewage odour and BOD by neural networks
Environmental Modelling & Software
(2005)
B. Raduly et al.
Artificial neural networks for rapid WWTP performance evaluation: methodology and case study
Environmental Modelling & Software
(2007)

M.A. Yong et al.
Intelligent control aeration and external carbon addition for improving nitrogen removal
Environmental Modelling & Software
(2006)
S. Amari et al.
Asymptotic statistical theory of overtraining and cross-validation
IEEE Transactions on Neural Networks
(1997)

