Research article
Data-based modeling and prediction of cytotoxicity induced by contaminants in water resources

https://doi.org/10.1016/j.compbiolchem.2011.02.001

Abstract

This paper is concerned with dynamic modeling, prediction and analysis of cytotoxicity induced by water contaminants. A real-time cell electronic sensing (RT-CES) system has been used to continuously monitor the dynamic cytotoxicity responses of living cells. Cells are grown on the surfaces of the microelectronic sensors. Changes in cell number, expressed as a cell index (CI), have been recorded on-line as time series. The CI data are used to develop dynamic prediction models for the cytotoxicity process. We use the support vector regression (SVR) algorithm to implement data-based system identification for dynamic modeling and prediction of cytotoxicity. In several validation studies, multi-step-ahead predictions are calculated and compared with the actual CI obtained from experiments. The results show that SVR-based dynamic modeling has great potential for predicting the cytotoxicity response of cells in the presence of toxicants.

Research highlights

► We model dynamic cytotoxicity responses of living cells to 3 potential water toxicants.
► Support vector regression (SVR) is applied to develop nonlinear dynamic local models.
► The local SVR-based models demonstrate better predictive performance than the ANNs.
► The local SVR-based models are more robust to the increase of prediction horizons.
► SVR-based dynamic modeling has great potential in predicting cytotoxicity responses.

Introduction

Chemical disinfection of water was a major public health triumph of the 20th century. Yet, the ever-increasing number of chemical compounds produced by various process industries has prompted the development of research methods for rapid cytotoxicity screening to enhance water quality monitoring.

There are several methods for early warning monitoring to detect hazardous events in water supplies (Hasan et al., 2004). Two representative ones are the analytical–chemical approach and the biological approach. Analytical–chemical methods can detect a specific compound or a range of compounds with similar properties. By itself, however, this approach does not necessarily give information about bioavailability and possible toxic effects (Brosnan, 1999). The main weakness of the analytical–chemical approach is therefore its inability to directly detect the effect of a toxicant on living systems. On the other hand, biological early warning systems, or bioalarming systems, are capable of signaling hazardous events and directly detecting their effects, regardless of the type and concentration of the substances. However, because they rely on the responses of living organisms, bioalarming systems give more false positives, and installing such a system implies accepting a certain risk of false positives. The limitations of bioalarming systems are therefore their uncertainty, the difficulty of tracking down the cause, and their slow response. Clearly, an early warning system capable of quick and reliable detection of hazardous effects is still desirable. During the past few decades, applications of mathematical modeling in the assessment of water quality have been widely investigated by Clark et al. (1986), Mazijk (1996) and others. In recent years, mathematical models have been established as a valuable supplement to the classical methods for online water quality monitoring (Yang et al., 2008).

Water contaminants have two major effects on human cells, namely toxicity effects (cell killing by apoptosis and/or necrosis) and cancer effects (uncontrolled cell proliferation stimulated by carcinogenic contaminants). To obtain predictions of these effects for early warning purposes, mathematical models can be developed to describe them. Such models can predict cell responses to different toxicant concentrations and allow assessment of the biological consequences of toxic chemicals in environmental contamination (Ibrahim et al., 2010). The main objective of this work is thus to develop dynamic mathematical models that predict the cytotoxic effects of certain water contaminants on living cells.

Cytotoxicity is the degree to which an agent exerts a specific toxic action on living cells, including cell killing, cell lysis and certain cellular pathological changes, such as changes in cellular morphology and adhesion. When exposed to toxic compounds, cells undergo physiological and pathological changes, including morphological dynamics, an increase or decrease in cell adherence to the extracellular matrix, cell cycle arrest, DNA damage, apoptosis and necrosis (Xing et al., 2005). Such cellular changes are dynamic and depend largely on the cell type, the nature of the chemical compound, the compound concentration, and the exposure duration. Dynamic responses are typically described by dynamic models. At a very general level, one can develop two different classes of dynamic models, namely knowledge-driven and data-driven. Knowledge-driven models are also called first-principle models because they are built on phenomenological knowledge of the underlying mechanisms, such as toxicant transport. In contrast, data-driven models are called black-box models because they are based only on the historical relations among the existing measurements, which spares one the laborious study of the complicated biological and physical phenomena involved (Ljung, 1999).

Regardless of the modeling approach, the dynamic experiment is the first and most important step in ensuring the quality of the developed dynamic models. However, dynamic monitoring of cytotoxicity is difficult to achieve in most conventional cell-based assays because they need chemical or radiation indicators that may kill or disturb the target cells. To overcome this, Xing et al. (2005) investigated an automatic, real-time cell electronic sensing (RT-CES) system. This cell-based assay system allows dynamic detection of a broad range of physiological and pathological responses to toxic agents in living cells.

Dynamic modeling of cytotoxicity using both system identification and first-principle methods has been shown to be feasible by Huang and Xing (2006). However, developing a first-principle model becomes practically infeasible if the underlying mechanism is not well understood. For instance, some phenomena observed in cytotoxicity experiments, such as initial cell fusion, are not well understood and thus cannot be explained from first principles. The difficulties in developing first-principle models for cell-killing mechanisms induced by toxicants were discussed in detail in Huang and Xing (2006). Since certain dynamics in the cytotoxicity process are very difficult or impossible to model from first principles, owing to limited understanding of the complex underlying biochemical and morphological processes, the focus of this paper is on improving the performance of data-driven predictive models. Several techniques have been developed in past years for data-driven dynamic modeling of cytotoxicity, most of which are based on time series analysis (Huang and Xing, 2005). However, cytotoxicity mechanisms are highly nonlinear. Being capable of capturing this nonlinearity, artificial neural networks (ANNs) have shown good performance in short-term prediction of the dynamic response of cell populations in the presence of toxicants, as presented in Huang and Xing (2006). However, the model performance deteriorates as the prediction horizon increases. In addition, ANN training offers no guarantee of convergence, of avoiding local minima, or of avoiding overfitting, and there is no general method for specifying the network architecture (Yan et al., 2004). In recent years, support vector regression (SVR) (Vapnik, 1999), a machine learning formalism based on statistical learning theory, has been gaining popularity owing to its many attractive features and promising empirical modeling performance.
While the empirical risk minimization (ERM) principle is generally employed in traditional ANNs, SVR implements the structural risk minimization (SRM) principle, which seeks to minimize an upper bound on the generalization error rather than the training error. Based on the SRM principle, SVR achieves a balance between the training error and the generalization error. Therefore, the overfitting seen in traditional ANNs can be avoided and better generalization performance can be obtained. Furthermore, the difficulty of choosing a network structure is handled automatically in SVR.

The proven advantages of SVR inspire us to employ it in constructing a data-driven predictive model to improve the effectiveness and efficiency of the cytotoxicity monitoring investigated in Huang and Xing (2006). Among the different formulations of the SVR problem, we adopt the ν-SVR algorithm (Schölkopf et al., 2000) as the core of the CI prediction framework. LIBSVM, a widely used SVM library with a MATLAB interface, is applied to construct the cell index prediction models. The developed model is found capable of analyzing intrinsic cell behavior and predicting the trajectory of its progress (growth or death) over a considerable time horizon.

Section snippets

Cytotoxicity experiments

The RT-CES system (ACEA Biosciences, CA, U.S.A.) has been used to monitor cellular events by measuring the electronic impedance of sensor electrodes integrated on the bottom of microtiter plates. The RT-CES system was described in detail in Xing et al. (2005). Briefly, it is composed of three main components: an electronic sensor analyzer, a device station, and a 16 × microelectronic sensor device. Cells were grown on the surfaces of microelectronic sensors, which are comprised of

Problem statement

An essential prerequisite for a successful early warning system is continuous collection of accurate data describing the risk of toxicant contamination. However, the key quality indicators are normally available through off-line sample analysis, which is often expensive and requires frequent and high-cost maintenance. Furthermore, a significant time delay of a few days is usually unavoidable in laboratory testing. Consequently, the lack of suitable key variable information in a timely manner

Support vector regression

The standard support vector regression algorithm is revisited in this section. We emphasize the principles of SVR, how to use it to solve the dynamic modeling problem, and how the tuning parameters should be chosen. For a more in-depth discussion of the associated statistical learning theory, see Vapnik (1998).
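For orientation, the ν-SVR primal problem can be written in the form commonly attributed to Schölkopf et al. (2000). The notation is an assumption made here for illustration, since the section's own equations are not reproduced in this excerpt: training pairs (x_i, y_i) for i = 1, ..., ℓ, a feature map φ, weight vector w, bias b, slack variables ξ_i and ξ_i^*, regularization constant C and parameter ν in (0, 1].

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^{*},\,\varepsilon}\quad
  & \frac{1}{2}\lVert w\rVert^{2}
    + C\left(\nu\varepsilon
    + \frac{1}{\ell}\sum_{i=1}^{\ell}\left(\xi_i + \xi_i^{*}\right)\right) \\
\text{subject to}\quad
  & \left(w^{\top}\phi(x_i) + b\right) - y_i \le \varepsilon + \xi_i, \\
  & y_i - \left(w^{\top}\phi(x_i) + b\right) \le \varepsilon + \xi_i^{*}, \\
  & \xi_i \ge 0,\quad \xi_i^{*} \ge 0,\quad \varepsilon \ge 0,
    \qquad i = 1,\dots,\ell .
\end{aligned}
```

In this formulation the tube width ε is itself an optimization variable rather than a user-chosen constant. The parameter ν acts as an upper bound on the fraction of training points lying outside the tube and a lower bound on the fraction of support vectors.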

SVR-based predictive model

A prediction model based on SVR is a data-driven model, which is based only on measurements.
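One common way to build a model "based only on measurements" is to regress each CI value on a window of the preceding values. The sketch below illustrates that idea. The function name `make_lagged_pairs`, the choice of three lags and the sample readings are all hypothetical, and the regressor structure actually used by the authors is not given in this excerpt.

```python
from typing import List, Sequence, Tuple


def make_lagged_pairs(
    series: Sequence[float], n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Turn a time series into (regressor, target) pairs for one-step-ahead learning."""
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    if len(series) <= n_lags:
        raise ValueError("series must be longer than n_lags")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for t in range(n_lags, len(series)):
        # Regressor: the n_lags measurements preceding time t, oldest first.
        inputs.append(list(series[t - n_lags:t]))
        # Target: the measurement at time t.
        targets.append(series[t])
    return inputs, targets


# Hypothetical hourly CI readings, invented for illustration only.
ci = [1.0, 1.2, 1.5, 1.4, 1.1, 0.8]
X, y = make_lagged_pairs(ci, n_lags=3)
```

The resulting pairs can be passed to any regression routine, for example a ν-SVR implementation, to fit a one-step-ahead predictor.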

Dynamic prediction

In this section, two different predictions are evaluated. One is relatively short-term and the other is long-term. For the short-term prediction, we consider one-step- (1 h) and five-step-ahead (5 h) predictions. For the long-term prediction, we consider a varying-horizon prediction, in which only the first three measurements are used to predict all future responses. To demonstrate the pattern of prediction performance versus prediction horizons, a comparison of performance among one-step-,
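A multi-step-ahead forecast can be produced from a one-step-ahead model by feeding each prediction back in as an input. The sketch below shows this recursive strategy starting from three initial measurements. It assumes a recursive scheme, which this excerpt does not confirm as the authors' method. The names `predict_recursive` and `toy_model` are hypothetical, and `toy_model` is a stand-in for a trained regressor, not an SVR.

```python
from typing import Callable, List, Sequence


def predict_recursive(
    one_step_model: Callable[[Sequence[float]], float],
    initial_window: Sequence[float],
    horizon: int,
) -> List[float]:
    """Iterate a one-step-ahead model, feeding each prediction back as an input."""
    window = list(initial_window)
    predictions: List[float] = []
    for _ in range(horizon):
        next_value = one_step_model(window)
        predictions.append(next_value)
        # Slide the window: drop the oldest value, append the new prediction.
        window = window[1:] + [next_value]
    return predictions


def toy_model(window: Sequence[float]) -> float:
    """Placeholder for a trained regressor: last value plus the last increment."""
    return window[-1] + (window[-1] - window[-2])


# Hypothetical first three measurements, invented for illustration only.
first_three = [1.0, 2.0, 3.0]
forecast = predict_recursive(toy_model, first_three, horizon=4)
```

Because every step after the first relies on earlier predictions instead of measurements, errors can accumulate as the horizon grows, which is why performance is usually reported per horizon.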

Conclusion

In this paper, we have considered dynamic modeling and prediction of cytotoxicity induced by water contaminants. The CI data from a real-time cell electronic sensing (RT-CES) system have been used for dynamic modeling. Support vector regression was applied to develop black-box nonlinear dynamic models. The ν-SVR algorithm is recommended because it automatically adjusts the width of the ε-insensitive tube. The developed models are verified using data that do

Acknowledgments

The authors gratefully acknowledge the financial support from the Natural Sciences and Engineering Research Council of Canada (NSERC) and Alberta Health and Wellness.
