A multivariate process monitoring strategy and control concept for a small-scale fermenter in a PAT environment

Journal of Intelligent Manufacturing

Abstract

This work describes a multivariate monitoring and control concept for bioprocesses based on historical process data. The concept is demonstrated for a Saccharomyces cerevisiae (baker’s yeast) fermentation process executed in a small-scale bioreactor, which is equipped with common probes to analyze the broth and off-gases. The data of “in-control” fermentation processes were evaluated by means of a principal component analysis to define confidence limits for subsequent fermentations. A violation of these limits indicated that a process had to be classified as “out-of-control”. Fault diagnosis was provided by the components of the squared prediction error, which can also be used to determine the appropriate counteractions, e.g. via an expert system control strategy as described in this study. The sensitivity of fault diagnosis was demonstrated via various erroneous runs. The duration of bioprocesses can vary distinctly, which complicates the definition of time-dependent control limits. Therefore, this study utilizes a three-component partial least squares regression model to quantify the current batch maturity during the process. This maturity is then used to reference current data to the appropriate historical data and the assigned control limits.

Notes

  1. \(X_{res} = \frac{1}{J-A}\sum _{j=1}^{J} \left( x_j - \widehat{x}_{j,A} \right) ^{2}\). The X-residual is a similar expression found in the literature; the only difference is that it is weighted by \(1/(J-A)\), where A is the number of principal components used in the model. \(X_{res}\) increases with A, and vice versa.
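The relation between the per-variable squared residuals and the weighted X-residual can be sketched in a few lines of Python. This is only an illustration: the function names and the numbers are hypothetical, the reconstruction \(\widehat{x}_{j,A}\) is assumed to be supplied by an existing model with A components, and the unweighted sum is taken to be the usual squared prediction error.

```python
def squared_residuals(x, x_hat):
    """Per-variable squared residuals (x_j - x_hat_j)^2."""
    if len(x) != len(x_hat):
        raise ValueError("x and x_hat must have the same length")
    return [(xj - xhj) ** 2 for xj, xhj in zip(x, x_hat)]


def x_residual(x, x_hat, n_components):
    """Sum of squared residuals weighted by 1 / (J - A), as in the note above."""
    n_variables = len(x)
    if not 0 <= n_components < n_variables:
        raise ValueError("A must satisfy 0 <= A < J")
    return sum(squared_residuals(x, x_hat)) / (n_variables - n_components)


# Made-up observation with J = 4 variables and its reconstruction from A = 2 PCs.
x = [1.0, 2.0, 3.0, 4.0]
x_hat = [1.1, 1.9, 3.2, 3.8]

contributions = squared_residuals(x, x_hat)  # one entry per variable
spe = sum(contributions)                     # unweighted sum of squares
x_res = x_residual(x, x_hat, n_components=2)
print(contributions, spe, x_res)
```

The individual entries of `contributions` are what a fault diagnosis would inspect, since a single large entry points to the variable that the model reconstructs poorly.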

Abbreviations

n :

Index of the batch

k :

Index of the maturity

j :

Index of the variable

a :

Index of PLS or PCA dimensions

A :

Number of considered PLS or PCA dimensions

\(\varvec{X}\) :

Standardized observations for PLS model(s) (no. of rows = no. of observations; no. of columns = no. of variables)

\(\varvec{Y}\) :

Standardized response for PLS model(s), i.e., the maturity

\(\widehat{{\varvec{Y}}}\) :

(Non-standardized) response for PLS model(s), i.e., the maturity

\(\varvec{E}\) :

Residual matrix for \(\varvec{X}\) in PLS

\(\varvec{F}\) :

Residual matrix for \({\varvec{Y}}\) in PLS

\(t_a \) :

Score values of the \(a^\mathrm{th}\) PC of the PLS model(s)

\(\varvec{p}_{\mathbf {a}}, \varvec{q}_{\mathbf {a}}\) :

(Normalized) loading vectors of the \(a^\mathrm{th}\) PC of the PLS model(s)

\(\varvec{Y}_{PLS i} \) :

Maturity predicted by an \(i^\mathrm{th}\) PLS model using 90 % of the observations in \(\varvec{X}\)

\(\mathop {\varvec{X}}\limits ^{\prime } \) :

Standardized observations for PCA model(s) (no. of rows = no. of observations; no. of columns = no. of variables)

\(\mathop {\varvec{x}}\limits ^{\prime }\) :

Single standardized observation

\(\mathop {\varvec{E}}\limits ^{\prime } \) :

Residual matrix for \(\mathop {\varvec{X}}\limits ^{\prime }\) in PCA

\({\mathop {\varvec{p}}\limits ^{\prime }}_a \) :

(Normalized) loading vectors of the \(a^\mathrm{th}\) PC of the PCA model

\(\overline{{\mathop {t}\limits ^{\prime }}_{a}}\) :

Score values of the \(a^\mathrm{th}\) principal component averaged over the in-control batches

mat :

Batch maturity

References

  • Albert, S., & Kinley, R. D. (2001). Multivariate statistical monitoring of batch processes: An industrial case study of fermentation supervision. Trends in Biotechnology, 19(2), 53–62. http://www.ncbi.nlm.nih.gov/pubmed/11164554.

  • Alford, J. S. (2006). Bioprocess control: Advances and challenges. Computers & Chemical Engineering, 30(10–12), 1464–1475. doi:10.1016/j.compchemeng.2006.05.039.

  • Alt, F. B., & Smith, N. D. (1988). Quality control and reliability. Handbook of statistics, vol. 7. Amsterdam: Elsevier. doi:10.1016/S0169-7161(88)07019-1.

  • Chiang, L. H., Leardi, R., Pell, R. J., & Seasholtz, M. B. (2006). Industrial experiences with multivariate statistical analysis of batch process data. Chemometrics and Intelligent Laboratory Systems, 81(2), 109–119. doi:10.1016/j.chemolab.2005.10.006.

  • Cimander, C., Bachinger, T., & Mandenius, C.-F. (2003). Integration of distributed multi-analyzer monitoring and control in bioprocessing based on a real-time expert system. Journal of Biotechnology, 103(3), 237–248. doi:10.1016/S0168-1656(03)00121-4.

  • Doan, X.-T., & Srinivasan, R. (2008). Online monitoring of multi-phase batch processes using phase-based multivariate statistical process control. Computers & Chemical Engineering, 32(1–2), 230–243. doi:10.1016/j.compchemeng.2007.05.010.

  • FDA. (2004). Guidance for industry: PAT—A framework for innovative pharmaceutical development, manufacturing, and quality assurance. Pharmaceutical CGMPs.

  • Ferreira, A. P., Lopes, J. A., & Menezes, J. C. (2007). Study of the application of multiway multivariate techniques to model data from an industrial fermentation process. Analytica Chimica Acta, 595(1–2), 120–127. doi:10.1016/j.aca.2007.05.007.

  • Fransson, M., & Folestad, S. (2006). Real-time alignment of batch process data using COW for on-line process monitoring. Chemometrics and Intelligent Laboratory Systems, 84(1–2), 56–61. doi:10.1016/j.chemolab.2006.04.020.

  • Gao, W. J., Jane, H. J., Lin, K. T. L., & Liao, B. Q. (2010). Influence of elevated pH shocks on the performance of a submerged anaerobic membrane bioreactor. Process Biochemistry, 45(8), 1279–1287. doi:10.1016/j.procbio.2010.04.018.

  • Glassey, J., Montague, G., & Mohan, P. (2000). Issues in the development of an industrial bioprocess advisory system. Trends in Biotechnology, 18(4), 136–41. http://www.ncbi.nlm.nih.gov/pubmed/10740258.

  • González-Martínez, J. M., Ferrer, A., & Westerhuis, J. A. (2011). Real-time synchronization of batch trajectories for on-line multivariate statistical process control using dynamic time warping. Chemometrics and Intelligent Laboratory Systems, 105(2), 195–206. doi:10.1016/j.chemolab.2011.01.003.

  • Gregersen, L., & Jørgensen, S. B. (1999). Supervision of fed-batch fermentations. Chemical Engineering Journal, 75(1), 69–76. doi:10.1016/S1385-8947(99)00018-2.

  • Honda, H., & Kobayashi, T. (2004). Industrial application of fuzzy control in bioprocesses. Advances in Biochemical Engineering/biotechnology, 87, 151–71. http://www.ncbi.nlm.nih.gov/pubmed/15217106.

  • Ijima, H., Kakeya, Y., Ogata, T., & Sakai, T. (2009). Development of a practical small-scale circulation bioreactor and application to a drug metabolism simulator. Biochemical Engineering Journal, 44(2–3), 292–296. doi:10.1016/j.bej.2008.12.015.

  • International Conference on Harmonization (2004). Guidance for Industry: Q8(R2) Pharmaceutical Development.

  • International Conference on Harmonization (2009). Guidance for Industry: Q9 Quality Risk Management.

  • Jaumot, J., Igne, B., Anderson, C. A., Drennen, J. K., & de Juan, A. (2013). Blending process modeling and control by multivariate curve resolution. Talanta, 117(117C), 492–504. doi:10.1016/j.talanta.2013.09.037.

  • Jiménez-González, C., & Woodley, J. M. (2010). Bioprocesses: Modeling needs for process evaluation and sustainability assessment. Computers & Chemical Engineering, 34(7), 1009–1017. doi:10.1016/j.compchemeng.2010.03.010.

  • Jørgensen, P., Pedersen, J. G., Jensen, E. P., & Esbensen, K. H. (2004). On-line batch fermentation process monitoring (NIR)-introducing ‘biological process time’. Journal of Chemometrics, 18(2), 81–91. doi:10.1002/cem.850.

  • Kandel, T. P., Gislum, R., Jørgensen, U., & Lærke, P. E. (2013). Prediction of biogas yield and its kinetics in reed canary grass using near infrared reflectance spectroscopy and chemometrics. Bioresource Technology, 146(October), 282–287. doi:10.1016/j.biortech.2013.07.092.

  • Karadag, D., & Puhakka, J. A. (2010). Effect of changing temperature on anaerobic hydrogen production and microbial community composition in an open-mixed culture bioreactor. International Journal of Hydrogen Energy, 35(20), 10954–10959. doi:10.1016/j.ijhydene.2010.07.070.

  • Kourti, T. (2006). Process analytical technology beyond real-time analyzers: The role of multivariate analysis. Critical Reviews in Analytical Chemistry, 36(3–4), 257–278. doi:10.1080/10408340600969957.

  • Kourti, T., Nomikos, P., & MacGregor, J. F. (1995). Analysis, monitoring and fault diagnosis of batch processes using multiblock and multiway PLS. Journal of Process Control, 5(4), 277–284. doi:10.1016/0959-1524(95)00019-M.

  • Kresta, J. V., Macgregor, J. F., & Marlin, T. E. (1991). Multivariate statistical monitoring of process operating performance. The Canadian Journal of Chemical Engineering, 69(1), 35–47. doi:10.1002/cjce.5450690105.

  • Lee, D. S., & Vanrolleghem, P. A. (2003). Monitoring of a sequencing batch reactor using adaptive multiblock principal component analysis. Biotechnology and Bioengineering, 82(4), 489–497. doi:10.1002/bit.10589.

  • Lennox, B., Montague, G. A., Hiden, H. G., Kornfeld, G., & Goulding, P. R. (2001). Process monitoring of an industrial fed-batch fermentation. Biotechnology and Bioengineering, 74(2), 125–35. doi:10.1002/bit.1102.

  • Lopes, J. A., Menezes, J. C., Westerhuis, J. A., & Smilde, A. K. (2002). Multiblock PLS analysis of an industrial pharmaceutical process. Biotechnology and Bioengineering, 80(4), 419–427. doi:10.1002/bit.10382.

  • Luttmann, R., Borchert, S.-O., Mueller, C., Loegering, K., Aupert, F., Weyand, S., et al. (2015). Sequential/parallel production of potential malaria vaccines—A direct way from single batch to quasi-continuous integrated production. Journal of Biotechnology, 213(February), 83–96. doi:10.1016/j.jbiotec.2015.02.022.

  • MacGregor, J. F., Jaeckle, C., Kiparissides, C., & Koutoudi, M. (1994). Process monitoring and diagnosis by multiblock PLS methods. AIChE Journal, 40(5), 826–838. doi:10.1002/aic.690400509.

  • MacGregor, J. F., & Kourti, T. (1995). Statistical process control of multivariate processes. Control Engineering Practice, 3(3), 403–414. doi:10.1016/0967-0661(95)00014-L.

  • Martin, E. B., Morris, A. J., & Zhang, J. (1996). Process performance monitoring using multivariate statistical process control. IEE Proceedings—Control Theory and Applications, 143(2), 132–144. doi:10.1049/ip-cta:19960321.

  • Menezes, J. C. (2011). Comprehensive biotechnology. Amsterdam: Elsevier. doi:10.1016/B978-0-08-088504-9.00205-1.

  • Nomikos, P., & MacGregor, J. F. (1995a). Multivariate SPC charts for monitoring batch processes. Technometrics, 37(1), 41–59. doi:10.1080/00401706.1995.10485888.

  • Nomikos, P., & MacGregor, J. F. (1995b). Multi-way partial least squares in monitoring batch processes. Chemometrics and Intelligent Laboratory Systems, 30(1), 97–108. doi:10.1016/0169-7439(95)00043-7.

  • Rathore, A. S. (2014). QbD/PAT for bioprocessing: Moving from theory to implementation. Current Opinion in Chemical Engineering, 6, 1–8. doi:10.1016/j.coche.2014.05.006.

  • Ryan, T. P. (2011). Statistical methods for quality improvement (3rd ed.). New Jersey: Wiley.

  • Sarraguça, M. C., Ribeiro, P. R. S., Santos, A. O., Silva, M. C. D., & Lopes, J. A. (2014). A PAT approach for the on-line monitoring of pharmaceutical co-crystals formation with near infrared spectroscopy. International Journal of Pharmaceutics, 471(1–2), 478–484. doi:10.1016/j.ijpharm.2014.06.003.

  • Shewhart, W. A. (1986). Statistical method from the viewpoint of quality control. Edited by W. Edwards Deming. Dover.

  • Varmuza, K., & Filzmoser, P. (2009). Introduction to multivariate statistical analysis in chemometrics. Boca Raton: CRC Press/Taylor & Francis.

  • Vojinović, V., Cabral, J. M. S., & Fonseca, L. P. (2006). Real-time bioprocess monitoring. Sensors and Actuators B: Chemical, 114(2), 1083–1091. doi:10.1016/j.snb.2005.07.059.

  • Wold, S., Kettaneh, N., Fridén, H., & Holmberg, A. (1998). Modelling and diagnostics of batch processes and analogous kinetic experiments. Chemometrics and Intelligent Laboratory Systems, 44(1–2), 331–340. doi:10.1016/S0169-7439(98)00162-2.

  • Wold, S., Kettaneh-Wold, N., MacGregor, J. F., & Dunn, K. G. (2009). Comprehensive chemometrics. Amsterdam: Elsevier. doi:10.1016/B978-044452701-1.00108-3.

  • Wold, S., Sjöström, M., & Eriksson, L. (2001). PLS-regression: A basic tool of chemometrics. Chemometrics and Intelligent Laboratory Systems, 58(2), 109–130. doi:10.1016/S0169-7439(01)00155-1.

  • Yang, W.-A. (2013). Monitoring and diagnosing of mean shifts in multivariate manufacturing processes using two-level selective ensemble of learning vector quantization neural networks. Journal of Intelligent Manufacturing, 26(4), 769–783. doi:10.1007/s10845-013-0833-z.

  • Zhu, D., Bai, J., & Yang, S. X. (2010). A multi-fault diagnosis method for sensor systems based on principle component analysis. Sensors (Basel, Switzerland), 10(1), 241–253. doi:10.3390/s100100241.

Acknowledgments

We would like to thank Johannes Österreicher and Johannes Scheiblauer for support in terms of sensor implementation and execution of fermentations.

Author information

Corresponding author

Correspondence to Maximilian O. Besenhard.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (docx 280 KB)

Appendix: Statistically in-control/out-of-control

The selection of historical data for the generation of (M)SPC charts is commonly referred to as Phase I. This selection is of utmost importance to obtain statistical models that are sensitive to out-of-control events in future batches (Phase II).

First, it has to be defined how, and on what grounds, processed batches are classified as in-control. These questions depend on the chosen quality characteristics, i.e., on whether process histories (selected variables at various points in time) or simply product quality attributes are considered. Since the latter are commonly multivariate as well (e.g., yield, purity, metabolite concentrations, etc.), this appendix discusses retrospective testing of batch processes via multivariate analysis.

Assuming that the \(\nu \) quality characteristics (e.g. \(\nu = \mathbf{JK}\) for process histories, unfolding the three-way data structure as shown in Fig. 1b) follow a \(\nu \)-variate normal distribution, Hotelling’s \(\hbox {T}^{2}\) statistic is applied. \({\upchi }^{2}\) statistics are used instead if the (co)variances are known or can be estimated accurately, e.g. if the number of observations is much higher than \(\nu \).

Hotelling’s \(\hbox {T}^{2}\) statistic for deviations from a known mean

\(\hbox {T}^{2}\) as defined in Eq. 10 is used to indicate if averaged quality characteristics deviate significantly from a known \(\upmu \), representing the process mean under stable conditions.

$$\begin{aligned} T^{2}=n\cdot \left( {{\bar{x}} -\upmu } \right) ^{\mathrm{T}}\cdot S^{-1}\cdot ({{\bar{x}} -\upmu }) \end{aligned}$$
(10)

Here \(\upmu \) and \({\bar{x}} \) are \(\nu \)-dimensional vectors, \(S^{-1}\) is the inverse of the estimated covariance matrix and n is the number of observations (i.e., batches) averaged to obtain \({\bar{x}} \). Under the null hypothesis \(\upmu =\upmu _0 \) (e.g. when quality characteristics are assumed to fluctuate around known/expected values), \(T^{2}\) is distributed as:

$$\begin{aligned} \frac{\nu \cdot ({n-1})}{n-\nu } \cdot F_{({\nu ,n-\nu })}, \end{aligned}$$
(11)

where \(F_{({\nu , n-\nu })}\) refers to the F-distribution with \(\nu \) and \(n-\nu \) degrees of freedom. Hence, an upper control limit with a significance level \(\alpha \) can be defined as:

$$\begin{aligned} T^{2}_{ucl} =\frac{\nu \cdot ({n-1})}{n-\nu }\cdot F_{1-\alpha ({\nu , n-\nu })} \end{aligned}$$
(12)
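To make Eqs. 10–12 concrete, the following Python sketch evaluates them for the bivariate case (\(\nu = 2\)). It is an illustration only and not the implementation used in this work: the six observations and the target mean are made-up numbers, all function names are hypothetical, and the closed-form F quantile used here holds only for two numerator degrees of freedom (for general \(\nu \) a statistics library would be needed).

```python
from statistics import mean

NU = 2  # number of quality characteristics; the helpers below assume exactly two


def mean_and_covariance(data):
    """Mean vector and sample covariance matrix (n - 1 denominator) of 2-D data."""
    n = len(data)
    m0 = mean(row[0] for row in data)
    m1 = mean(row[1] for row in data)
    s00 = sum((row[0] - m0) ** 2 for row in data) / (n - 1)
    s11 = sum((row[1] - m1) ** 2 for row in data) / (n - 1)
    s01 = sum((row[0] - m0) * (row[1] - m1) for row in data) / (n - 1)
    return (m0, m1), ((s00, s01), (s01, s11))


def quadratic_form_inverse(d, S):
    """Evaluate d^T S^-1 d for a symmetric 2x2 matrix S without forming S^-1."""
    (a, b), (_, c) = S
    det = a * c - b * b
    if det <= 0.0:
        raise ValueError("covariance matrix is singular; T^2 is undefined")
    return (c * d[0] ** 2 - 2.0 * b * d[0] * d[1] + a * d[1] ** 2) / det


def f_quantile_two_numerator_df(p, df2):
    """Exact p-quantile of F(2, df2); this closed form is valid only for df1 = 2."""
    return (df2 / 2.0) * ((1.0 - p) ** (-2.0 / df2) - 1.0)


def t2_known_mean(data, mu):
    """Eq. 10: n * (x_bar - mu)^T S^-1 (x_bar - mu)."""
    x_bar, S = mean_and_covariance(data)
    d = (x_bar[0] - mu[0], x_bar[1] - mu[1])
    return len(data) * quadratic_form_inverse(d, S)


def t2_ucl_known_mean(n, alpha):
    """Eq. 12 for nu = 2."""
    scale = NU * (n - 1) / (n - NU)
    return scale * f_quantile_two_numerator_df(1.0 - alpha, n - NU)


# Six made-up batches with two quality characteristics each.
batches = [(9.8, 5.1), (10.2, 4.9), (10.1, 5.3), (9.9, 4.8), (10.3, 5.2), (9.9, 4.7)]
target = (10.0, 5.0)

t2 = t2_known_mean(batches, target)
ucl = t2_ucl_known_mean(len(batches), alpha=0.05)
in_control = t2 <= ucl
print(f"T2 = {t2:.3f}, UCL = {ucl:.3f}, in control: {in_control}")
```

For these numbers the averaged characteristics lie close to the target, so \(T^{2}\) stays far below the 95 % limit.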

Hotelling’s \(\hbox {T}^{2}\) statistic for individual observations

As discussed in Ryan (2011), Eq. 13 is used to investigate \(i=1,2, \ldots , m\) individual multivariate observations.

$$\begin{aligned} T_i^2 =({x_i -{\bar{x}}_m})^{\mathrm{T}}\cdot S_m^{-1} \cdot ({x_i-{\bar{x}}_m}) \end{aligned}$$
(13)

The \(T_i^2 \) statistics represent the distance of (future) individual observations \(x_i \) from the mean vector \({\bar{x}}_m\), weighted by the inverse of the covariance matrix \(S_m\). Here \({\bar{x}}_m \) and \(S_m\) are estimated from all m observations. For future observations, \(T_i^2 \) is distributed as:

$$\begin{aligned} \frac{\nu \cdot ({m+1})\cdot ({m-1})}{m \cdot ({m-\nu })}\cdot F_{({\nu , m-\nu })} \end{aligned}$$
(14)

and an upper control limit can be defined as:

$$\begin{aligned} {T_{i}^2}_{ucl} =\frac{\nu \cdot ({m+1})\cdot ({m-1})}{m\cdot ({m-\nu })}\cdot F_{1-\alpha ({\nu , m-\nu })} \end{aligned}$$
(15)
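Eqs. 13 and 15 can be sketched in the same way for individual observations. The snippet below assumes \(\nu = 2\) and takes \({\bar{x}}_m\) and \(S_m\) as given inputs; the numerical values and the function names are hypothetical and serve only to show how an observation is compared with the limit.

```python
def t2_individual(x, x_bar, S):
    """Eq. 13 for a bivariate observation: (x - x_bar)^T S^-1 (x - x_bar)."""
    d0, d1 = x[0] - x_bar[0], x[1] - x_bar[1]
    (a, b), (_, c) = S
    det = a * c - b * b
    if det <= 0.0:
        raise ValueError("covariance matrix is singular; T^2 is undefined")
    return (c * d0 * d0 - 2.0 * b * d0 * d1 + a * d1 * d1) / det


def t2_ucl_individual(m, alpha):
    """Eq. 15 for nu = 2, using the closed-form quantile of F(2, m - 2)."""
    nu = 2
    df2 = m - nu
    f_quantile = (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)  # valid only for df1 = 2
    return nu * (m + 1) * (m - 1) / (m * df2) * f_quantile


# Assumed estimates from m = 6 historical observations (illustrative numbers).
x_bar_m = (0.0, 0.0)
S_m = ((4.0, 1.0), (1.0, 1.0))
ucl = t2_ucl_individual(m=6, alpha=0.05)

t2_typical = t2_individual((1.0, 0.5), x_bar_m, S_m)
t2_deviating = t2_individual((6.0, -3.0), x_bar_m, S_m)

for label, value in (("typical", t2_typical), ("deviating", t2_deviating)):
    status = "in-control" if value <= ucl else "out-of-control"
    print(f"{label}: T2 = {value:.2f} vs UCL = {ucl:.2f} -> {status}")
```

The second observation deviates against the direction of the positive correlation in \(S_m\), which is why its \(T_i^2\) is much larger than its distance from the mean alone would suggest.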
Fig. 14 Score plot of the first PC using (top) fermentation 1–8 and (bottom) fermentation 1–6, i.e., without the outliers

PCA for retrospective testing of batches

\(T^{2}\) statistics cannot be computed if quality characteristics are highly correlated, as the covariance matrix becomes (nearly) singular and cannot be inverted reliably. This issue can be eliminated with the aid of PCA and \(T^{2}\) statistics based on scores. Furthermore, the number of observations can be small compared to the number of quality characteristics, which requires data compression, e.g. via PCA.
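Because the scores of different principal components are uncorrelated for the data the PCA was fitted on, their covariance matrix is diagonal and the quadratic form of Eq. 13 reduces to a sum of squared, variance-scaled scores. The sketch below illustrates this with hypothetical score values; the function name and the numbers are assumptions, and the PCA itself is taken as already computed.

```python
from statistics import mean, variance


def score_t2(new_scores, reference_scores):
    """T^2 of one batch from its PCA scores.

    reference_scores holds one row per in-control batch and one column per
    retained PC. The score covariance is treated as diagonal, so no matrix
    inversion is required.
    """
    t2 = 0.0
    for a, score in enumerate(new_scores):
        column = [row[a] for row in reference_scores]
        t2 += (score - mean(column)) ** 2 / variance(column)
    return t2


# Made-up scores of six in-control batches on two PCs (columns are uncorrelated).
reference = [
    (-2.0, 1.0),
    (-1.0, -1.0),
    (0.0, 0.0),
    (1.0, -1.0),
    (2.0, 1.0),
    (0.0, 0.0),
]

t2_close = score_t2((1.0, 0.4), reference)
t2_far = score_t2((4.0, 2.0), reference)
print(t2_close, t2_far)
```

Each term of the sum is well defined as long as the retained PCs have non-zero variance, which is the reason the score-based statistic avoids the singularity problem described above.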

Below, these principles are demonstrated using the fermentation runs discussed in this work (1–6) and two additional runs (7–8) which were not considered in Phase I.

In order to test if the fermentations are statistically in-control, the three-way data structure is unfolded as shown in Fig. 1b, i.e. into N observations with \(\nu = \mathbf{JK}\) variables. Afterwards, a PCA is executed to reduce the number of variables \(\nu \). PCA can provide an output even if \(\mathbf{N}< \nu \). In that case, PCs of a higher number than N do not contain valuable information. Furthermore, the robustness of PCA results needs to be tested, e.g. via bootstrapping, before scores are used in \(T^{2}\) statistics. Since only 6–8 batches were considered here (i.e., at most \(\mathbf{N}=8\)), the number of variables had to be reduced to obtain consistent results. Therefore, only \({og-CO}_{2}\) and \({pO}_{2}\) (see Table 1) were used (i.e., J \(=\) 2) at 100 time bins (i.e., \(\mathbf{K}=100\)).
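The unfolding step can be sketched as follows. The snippet turns N batches, each stored as K time bins with J variables, into N rows with JK columns and standardizes every column. The variable-major column order, the function names and the tiny data set are assumptions made for illustration; the arrangement in Fig. 1b may order the columns differently, which does not affect a subsequent PCA.

```python
from statistics import mean, stdev


def unfold_batchwise(batches):
    """Unfold N batches of shape K x J into N rows of length J * K.

    Columns are ordered variable by variable: all time bins of variable 1,
    then all time bins of variable 2, and so on (an assumed ordering).
    """
    rows = []
    for batch in batches:
        n_variables = len(batch[0])
        rows.append([sample[j] for j in range(n_variables) for sample in batch])
    return rows


def autoscale(rows):
    """Standardize each column to zero mean and unit standard deviation."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [stdev(col) for col in columns]
    return [
        [(v - mu) / sd if sd > 0.0 else 0.0 for v, mu, sd in zip(row, means, stds)]
        for row in rows
    ]


# Toy example: N = 3 batches, K = 2 time bins, J = 2 variables.
batches = [
    [[1.0, 10.0], [2.0, 20.0]],
    [[3.0, 30.0], [4.0, 40.0]],
    [[5.0, 50.0], [6.0, 60.0]],
]

unfolded = unfold_batchwise(batches)  # 3 rows x 4 columns
scaled = autoscale(unfolded)
print(unfolded)
print(scaled)
```

With J = 2 and K = 100 as used here, each batch would become a single row of 200 values before the PCA is applied.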

Fig. 15 \(T^{2}\) values of the individual observations, i.e., fermentation runs, and upper control limits for various significance levels

Fig. 16 95 % (\(\alpha =0.05\)) and 90 % (\(\alpha =0.10\)) upper control limits as calculated by Eq. 15 and \(T^{2}\) values of ten individual future observations using 6–17 observations to estimate \({\bar{x}}_m \) and \(S_m\) in Eq. 13. Here, only normally distributed random numbers were used as observations

Score plots are a useful tool to screen batches for abnormalities. Figure 14 shows the projections of batches 1–8 and 1–6 in the \(\hbox {t}_{1}-\hbox {t}_{2}\) plane. This score plot clearly indicates that fermentations 7–8 differ from fermentations 1–6. PCA results were shown to be robust for the first three PCs, but only the first two were used in the \(T^{2}\) statistics for individual observations (Eqs. 13–15), since they already explain 78 % of the variance. Figure 15 shows the \(T^{2}\) values obtained when considering only fermentations 1–6 in the PCA. This plot again quantifies the discrepancy between the fermentations and shows why fermentations 7–8 were not considered in Phase I. Such \(T^{2}\) charts can be used for retrospective testing (exploratory data analysis) of whether the quality characteristics of a processed batch vary significantly from historical data (i.e., as a Phase II chart).

The number of observations used here to test whether the statistical variations among fermentations are significant is rather low. If the number of observations and the number of variables are of the same order, upper control limits might not be estimated accurately. This is captured in Fig. 16, which shows upper control limits calculated via Eq. 15 together with \(T^{2}\) values for future observations.
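How strongly the limit of Eq. 15 depends on the number of observations m can be tabulated directly. The sketch below does this for \(\nu = 2\) and m = 6 to 17 and compares the result with the \({\upchi }^{2}\) limit that applies when the covariance is known. It is a simplified illustration with hypothetical function names, not a reproduction of Fig. 16; both closed-form quantiles hold only for \(\nu = 2\).

```python
import math

NU = 2  # number of scores used in the T^2 statistic


def ucl_individual(m, alpha):
    """Eq. 15 for nu = 2, using the closed-form quantile of F(2, m - 2)."""
    df2 = m - NU
    f_quantile = (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)
    return NU * (m + 1) * (m - 1) / (m * df2) * f_quantile


def ucl_known_covariance(alpha):
    """Chi-square quantile with 2 degrees of freedom: -2 * ln(alpha)."""
    return -2.0 * math.log(alpha)


limits_95 = {m: ucl_individual(m, 0.05) for m in range(6, 18)}
chi2_95 = ucl_known_covariance(0.05)

for m, limit in limits_95.items():
    print(f"m = {m:2d}: UCL = {limit:6.2f} (chi-square limit {chi2_95:.2f})")
```

With only six observations the 95 % limit is more than three times the \({\upchi }^{2}\) value, and it approaches that value only slowly as m grows.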

About this article

Cite this article

Besenhard, M.O., Scheibelhofer, O., François, K. et al. A multivariate process monitoring strategy and control concept for a small-scale fermenter in a PAT environment. J Intell Manuf 29, 1501–1514 (2018). https://doi.org/10.1007/s10845-015-1192-8

  • DOI: https://doi.org/10.1007/s10845-015-1192-8
