Data-driven process monitoring and fault analysis of reformer units in hydrogen plants: Industrial application and perspectives

https://doi.org/10.1016/j.compchemeng.2020.106756

Abstract

Reformer boxes are complex, integrated, high-temperature units that are subject to various failures during continuous operation over extended periods. The difficulty of developing high-fidelity first-principles models, combined with the easy availability of process measurements, has motivated the development of data-driven, automated fault detection (FD) systems. The paucity of plant-wide implementations of FD technologies in the chemical industry accentuates the absence of relevant practical guidelines and best practices. In this paper, an easily replicable FD system is developed for large-scale industrial reformer boxes of hydrogen manufacturing units. Actual process data from the plant historian were used to train and validate a novel model that combines partial least squares regression and principal component analysis. Abnormalities were identified based on several important measurements around the reformer. Explicit algorithmic details and insights obtained during development of the expert system are provided for ease of replication and adaptability.

Introduction

Large-scale industrial processes face increasing demands on process safety, uniformity of production quality, and overall plant reliability. To meet these objectives, real-time process monitoring is employed for early detection of process abnormalities and avoidance of severe equipment damage. Early abnormality detection significantly reduces maintenance and lost-production costs (Dash and Venkatasubramanian, 2000). For example, large-scale production of syngas (a gas mixture consisting primarily of hydrogen and carbon monoxide) employs reformer furnaces with several hundred catalyst-filled tubes (see the details in Section 2). Because this process is highly integrated, the failure of a single tube can initiate a cascade of failures of several other tubes inside the furnace box. Since the catalyst tubes in a reformer box account for approximately 15–20% of the total capital cost, the economic implications of early fault detection are significant. Industrial statistics estimate the economic impact of unplanned outages at petrochemical plants alone to be around $20 billion per year (Nimmo, 1995). However, efficient process monitoring and control has remained a challenge because of the increasing complexity of process systems and practical limitations on continuous manual monitoring. Hence, process monitoring, also commonly termed fault detection and diagnosis (FDD), has been an active field of research over the past few decades (MacGregor and Kourti, 1995; Venkatasubramanian et al., 2003c; Venkatasubramanian, Rengaswamy, Kavuri, 2003; Venkatasubramanian, Rengaswamy, Kavuri, Yin, 2003; Severson et al., 2016).

FDD can be broadly classified into three categories: analytical (model-based), knowledge-based and data-driven methods (Alzghoul et al., 2014). The analytical method relies on first-principles-based mathematical models of the process and incorporation of physical understanding of the system into the fault detection process. While the analytical models are expected to provide superior accuracy, development of a high-fidelity mathematical model of complex industrial processes can be difficult. For example, modeling radiative heat transfer inside large-scale furnace boxes in syngas plants can be quite involved (Kumar et al., 2015). The computationally-intensive calculations make model-based real-time monitoring of complex industrial processes infeasible.

Knowledge-based methods are rule-based expert systems in which the rules are derived from process engineers’ experience and plant operators’ intuitive knowledge of the underlying process. For example, plant operators may notice that the openings of flow-control valves exhibit specific characteristic patterns before failure; rules can be framed on these patterns to detect valve issues and prevent severe process disturbances. However, it is difficult and time-consuming to create an exhaustive collection of rules that covers a wide range of potential process faults. Additionally, for novel processes, the knowledge base can be extremely sparse. A good review of knowledge-based methods for fault analysis and diagnostics can be found in Venkatasubramanian et al. (2003a).

The data-driven approach to FDD utilizes historical process data. The data contain process information and capture intrinsic process complexities; hence, they can be used for modeling, monitoring, and control (Kano and Nakagawa, 2008; MacGregor et al., 2005). Process data from faulty and normal plant operations can be used to develop classification models that classify process conditions into faulty and normal classes (Yin et al., 2014b); however, large amounts of data from faulty plant operations are generally not available to build accurate classification models. Alternatively, data from a wide range of normal plant operations can be used to build statistical models that determine whether the process is operating normally. These multivariate statistical process monitoring (MSPM) methods (MacGregor and Kourti, 1995) have recently become more popular due to the rapid development of process instrumentation and data acquisition technology, and the wide utilization of distributed control systems (DCS) in modern industrial processes. Owing to the ease of access to process data and the abundance of inexpensive data management systems, the volume of data generated from large-scale industrial processes has been on the rise (Alzghoul et al., 2012). Principal component analysis (PCA) and partial least squares (PLS) are among the most popular MSPM techniques; they rely on the projection of high-dimensional process data onto a lower-dimensional space through latent variables to extract key process information (AlGhazzawi and Lennox, 2008; Qin, 2003; Flores-Cerrillo and MacGregor, 2004). Several other data-based methods, such as independent component analysis (ICA), artificial neural networks (ANNs), support vector machines (SVMs), kernel PCA/PLS, recursive PCA/PLS, and Gaussian mixture models (GMMs), have been explored to deal with issues such as non-Gaussianity, non-linearity, non-stationarity, and multiple operating modes (Lee et al., 2004; Cho et al., 2005; Qin, 1998). Excellent reviews of state-of-the-art data-driven FDD methods for industrial processes can be found in the literature (Qin, 2012; Ge et al., 2013; Yin et al., 2014a; Ge, 2017; Reis and Gins, 2017).

Empirical/data-driven models are especially useful in industrial settings where short development time and payback period, speed of implementation, and robustness to practical issues such as missing data are important. As discussed above, a plethora of data-driven techniques is available for the development of a process monitoring tool. However, each method has its advantages and shortcomings; a method that works well for one system might not exhibit satisfactory performance for another (Dash and Venkatasubramanian, 2000; Ge et al., 2013). Ng and Srinivasan (2010) and Perk et al. (2010) have proposed combining multiple FDD methods in a multi-agent system for process monitoring; these multi-agent systems, however, do not make selection of specific FDD techniques any easier. Most of the FDD techniques in the literature are benchmarked on the Tennessee Eastman (TE) process (Downs and Vogel, 1993; Howell et al., 1997). However, as noted by Chiang et al. (2017), this problem is somewhat antiquated.

Very few studies are available on the application of FDD methods to real large-scale industrial systems. The paucity of publicly available literature on the demonstrated success of data-driven FDD techniques on such systems, and on strategies to overcome the practical challenges faced during industrial application (related to model adaptation, model maintainability, data pre-processing, etc.; Kano and Nakagawa, 2008), is one of the reasons for the low industrial adoption of these techniques. Additionally, current research hints at context specificity, suggesting that modelers need to create an appropriate combination of tools designed for each application. In industry, it is common to find a conservative process monitoring approach in which static upper and lower alarm thresholds are used for a few key process variables; however, as shown later in the text, this conservative approach leads to relatively delayed detection of faults. Faster detection then depends on plant operators serendipitously identifying these faults from process graphs on HMI (human machine interface) screens in the control room. Of course, this is not an ideal solution to the fault detection problem.

In this paper, results from the development and application of an expert process monitoring system, based on data-driven FDD methodologies, for monitoring reformer boxes in hydrogen plants are presented. The reformer box is a physically large unit (16 m × 16 m × 12 m) operating at ~1800 °F to convert natural gas (methane) into syngas (Kumar et al., 2015). Several methods available in the literature are compared on their fault detection capabilities. Methods for both steady-state and dynamic processes are applied, and detailed step-by-step procedures are provided. Further, the developed expert system is successfully tested on two separate hydrogen plants to demonstrate the system’s replicability. The paper is organized as follows. Section 2 provides a brief overview of the hydrogen production process. The reformer monitoring workflow is then described, followed by a brief discussion of historical process data and data pre-processing. Sections 5 and 6 provide the details of the algorithms and the results from the application of steady-state and dynamic monitoring algorithms, respectively.

Section snippets

Reformer box and hydrogen plant

In this section, we describe the reforming process in a large-scale hydrogen plant typically producing more than a hundred million standard cubic feet per day of high-purity hydrogen. The purpose of this section is to highlight the heat-intensive, integrated nature of the HyCO process, which justifies the utility of intelligent data-driven FD technologies, for ease of replication. We begin by describing a representative flowsheet, followed by a description of a repertoire of faults that warrant

Reformer monitoring workflow

Fig. 2 shows the workflow for the real-time reformer-unit monitoring system. The workflow executes at regular intervals on a schedule. During each run, recent process data are retrieved from the historian and pre-processed to filter out measurement noise and bad measurements. The pre-processed data are analyzed by the monitoring algorithm, and the computed process metrics are compared against threshold values. Upon successful fault detection, the faulty variables are identified by the fault-diagnosis block
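The sequence of steps in this workflow (retrieve, pre-process, compute a metric, compare against a threshold, diagnose) can be sketched as a single function that is called on each scheduled run. This is only an illustrative skeleton: the function name, the callables passed to it, the tag name and the numbers in the usage example are all hypothetical and are not taken from the paper.

```python
def run_monitoring_cycle(fetch, preprocess, compute_index, limit, diagnose):
    """Run one scheduled monitoring cycle (hypothetical skeleton).

    fetch         -- returns recent raw data from the historian
    preprocess    -- removes noise and bad measurements
    compute_index -- maps clean data to a scalar monitoring metric
    limit         -- threshold the metric is compared against
    diagnose      -- returns the variables suspected of being faulty
    """
    raw = fetch()
    clean = preprocess(raw)
    index = compute_index(clean)
    if index <= limit:
        return {"fault": False, "index": index, "suspects": []}
    # Diagnosis is only attempted once a fault has been detected.
    return {"fault": True, "index": index, "suspects": diagnose(clean)}


# Usage with stand-in callables and made-up values.
result = run_monitoring_cycle(
    fetch=lambda: {"tube_temp": [870.0, 872.0, None, 905.0]},
    preprocess=lambda raw: {
        tag: [v for v in values if v is not None] for tag, values in raw.items()
    },
    compute_index=lambda data: max(data["tube_temp"]) - min(data["tube_temp"]),
    limit=20.0,
    diagnose=lambda data: ["tube_temp"],
)
print(result)
```

Passing the individual steps in as callables keeps the scheduling logic independent of the particular filter, monitoring metric, or diagnosis method that is plugged in.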

Plant historical data and pre-processing

About 3 years of data, sampled every minute and comprising about 1.4 million measurement samples, have been used for the development of the fault-detection model. A moving-average filter with a 30 min (30-sample) window is used to remove measurement noise from the raw historian data. Fig. 3 shows the filtered historical data for some key reformer input variables. Significant variations in the values of the input variables have been captured in the data. Plant operation state changes due to changes in
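A moving-average filter of the kind mentioned above can be sketched in a few lines. The version below assumes a trailing (causal) window and averages over however many samples are available during the first 29 minutes; the excerpt does not say whether the authors used a trailing or a centered window, so both choices are assumptions, as is the function name.

```python
from collections import deque


def moving_average(samples, window=30):
    """Trailing moving average over the last `window` samples.

    During start-up, fewer than `window` samples exist, so the average is
    taken over the samples seen so far (an assumed convention).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque()
    running_sum = 0.0
    filtered = []
    for value in samples:
        recent.append(value)
        running_sum += value
        if len(recent) > window:
            # Drop the oldest sample once the window is full.
            running_sum -= recent.popleft()
        filtered.append(running_sum / len(recent))
    return filtered


# One-minute samples with a 2-sample window, for illustration only.
print(moving_average([1, 2, 3, 4], window=2))
```

A trailing window suits online use because it needs no future samples, at the cost of a lag of roughly half the window length.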

Steady-state monitoring algorithm

Steady-state monitoring algorithms train fault-models on steady-state training data to identify normal operating states. During model training, control limits on one or more fault detection indices are determined. During testing, faulty/abnormal process data that deviate from normal process behavior and, consequently, violate the control limits are flagged as faults.
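The train-then-test pattern described above can be illustrated with a minimal sketch. It assumes Hotelling's T² on the raw variables as the fault detection index and an empirical 99th percentile of the training scores as the control limit; the excerpt does not state which indices or limits the authors used, and the synthetic two-variable data and all function names are hypothetical. Only the standard library is used so the example stays self-contained; a numerical library would be the usual choice in practice.

```python
import random


def column_means(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def covariance(rows, means):
    n, p = len(rows), len(means)
    cov = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - m for x, m in zip(row, means)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def t_squared(sample, means, cov):
    """Hotelling's T^2: squared Mahalanobis distance from the training mean."""
    d = [x - m for x, m in zip(sample, means)]
    return sum(di * yi for di, yi in zip(d, solve(cov, d)))


def train(rows, quantile=0.99):
    """Fit the normal-operation model and an empirical control limit."""
    means = column_means(rows)
    cov = covariance(rows, means)
    scores = sorted(t_squared(row, means, cov) for row in rows)
    limit = scores[int(quantile * (len(scores) - 1))]
    return means, cov, limit


def is_faulty(sample, model):
    means, cov, limit = model
    return t_squared(sample, means, cov) > limit


# Synthetic "normal" data: the second variable tracks twice the first.
rng = random.Random(0)
training = []
for _ in range(500):
    x1 = rng.gauss(0.0, 1.0)
    training.append([x1, 2.0 * x1 + rng.gauss(0.0, 0.1)])
model = train(training)

print(is_faulty([1.0, 2.0], model))   # sample that respects the correlation
print(is_faulty([1.0, -2.0], model))  # sample that breaks the correlation
```

Both test samples lie within the range that each variable covers individually in the training data; only the second violates the relationship between the variables, which is the kind of deviation a multivariate index is meant to catch.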

Dynamic monitoring algorithm

For a dynamic system, the current values of the system variables depend on their past values. For the reformer, it has been shown that the operating conditions vary frequently according to production requirements. Therefore, it is important that the temporal correlations among reformer variables are also taken into account during development of the fault-model; this enables the model to be used for fault-detection irrespective of whether the reformer is at steady-state or not. A convenient data-based scheme
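One common way to expose temporal correlations to a multivariate model is to augment each sample with time-lagged copies of itself, as in dynamic PCA (Ku et al., 1995, listed in the references). The sketch below shows only that augmentation step; whether the scheme the text goes on to describe works in exactly this way cannot be confirmed from this excerpt, and the function name and lag count are illustrative assumptions.

```python
def lagged_matrix(rows, lags):
    """Augment each sample with its `lags` predecessors.

    Row t of the result is [x(t), x(t-1), ..., x(t-lags)] flattened, so a
    static model fitted to it can also capture correlations across time.
    The first `lags` samples are dropped because their history is incomplete.
    """
    if lags < 0:
        raise ValueError("lags must be non-negative")
    augmented = []
    for t in range(lags, len(rows)):
        stacked = []
        for k in range(lags + 1):
            stacked.extend(rows[t - k])
        augmented.append(stacked)
    return augmented


# Two variables, four time steps, one lag.
data = [[1, 10], [2, 20], [3, 30], [4, 40]]
for row in lagged_matrix(data, lags=1):
    print(row)
```

The augmented matrix can then be passed to the same training routine as in the steady-state case, at the cost of a column count that grows with the number of lags.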

Monitoring algorithm workflow

After deployment of the process monitoring tool, false positives (fault alarms without an actual fault) and false negatives (no fault alarm when there are actual faults) are both undesirable; while the former lead to a loss of the user’s confidence in the tool, the latter lead to delayed fault detection by the plant operators. In previous sections, it was observed that while steady-state external analysis can lead to false positives during normal process transients, dynamic external analysis can lead
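The two error types defined above are usually summarized as a false alarm rate and a missed detection rate. A minimal sketch of how they could be computed from labeled test data follows; the function name and the short alarm and fault sequences are made up for illustration and do not come from the paper.

```python
def alarm_rates(alarms, faults):
    """Return (false_alarm_rate, missed_detection_rate).

    alarms -- per-sample booleans raised by the monitoring tool
    faults -- per-sample booleans marking the actual faulty periods
    """
    if len(alarms) != len(faults):
        raise ValueError("alarms and faults must have the same length")
    false_positives = sum(1 for a, f in zip(alarms, faults) if a and not f)
    false_negatives = sum(1 for a, f in zip(alarms, faults) if f and not a)
    normal_count = sum(1 for f in faults if not f)
    faulty_count = len(faults) - normal_count
    # Each rate is normalized by the number of samples it could occur in.
    false_alarm_rate = false_positives / normal_count if normal_count else 0.0
    missed_rate = false_negatives / faulty_count if faulty_count else 0.0
    return false_alarm_rate, missed_rate


alarms = [False, True, False, True, True, False]
faults = [False, False, False, True, True, True]
print(alarm_rates(alarms, faults))
```

Tracking both rates together matters because tightening a control limit lowers one at the expense of the other.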

FD application at identical plant

An advantage offered by the data-based monitoring methodologies is that once a method has been found that gives satisfactory performance for a particular system, the method can potentially be applied without many modifications for monitoring other similar systems. This keeps the investment of labor, time, and money for model development during tool replication low. For reformer monitoring, the external analysis-based method was repeated for monitoring another identical reformer-based hydrogen

Discussion and recommendations

In the previous sections, it was shown how process relationships can be extracted from process data to build process and fault-models for reformer boxes quickly. While this is very convenient, model-developers (and the end-users) should be aware of the limitations of data-driven approaches. If the historical process data, for some reason, do not accurately represent the behavior of the current system, frequency of false alarms will increase significantly. For example, in the hydrogen plant,

Conclusion

In this work, development and application of process data-based process monitoring has been reported for a large-scale reformer-box unit of a hydrogen manufacturing plant. A complete expert system workflow, from retrieving data from data historian to displaying fault details on plant operator’s screen, has been provided. External analysis was found to provide the best fault-detection performance. Similar fault-detection performance was obtained during direct application of the method at another

CRediT authorship contribution statement

Ankur Kumar: Conceptualization, Methodology, Software, Formal analysis, Data curation, Writing - original draft, Visualization, Project administration. Apratim Bhattacharya: Conceptualization, Methodology, Formal analysis, Writing - original draft. Jesus Flores-Cerrillo: Writing - review & editing, Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (47)

  • Y. Dong et al. Regression on dynamic PLS structures for supervised learning of dynamic data. J. Process Control (2018)
  • J.J. Downs et al. A plant-wide industrial process control problem. Comput. Chem. Eng. (1993)
  • Z. Ge. Review on data-driven modeling and monitoring for plant-wide industrial processes. Chemometr. Intell. Lab. Syst. (2017)
  • M. Kano et al. Evolution of multivariate statistical process control: application of independent component analysis and external analysis. Comput. Chem. Eng. (2004)
  • M. Kano et al. Data-based process monitoring, process control, and quality improvement: recent developments and applications in steel industry. Comput. Chem. Eng. (2008)
  • W. Ku et al. Disturbance detection and isolation by dynamic principal component analysis. Chemometr. Intell. Lab. Syst. (1995)
  • A. Kumar et al. A physics-based model for industrial steam-methane reformer optimization with non-uniform temperature field. Comput. Chem. Eng. (2017)
  • A. Kumar et al. Multi-resolution model of an industrial hydrogen plant for plantwide operational optimization with non-uniform steam-methane reformer temperature field. Comput. Chem. Eng. (2017)
  • J.-M. Lee et al. Statistical process monitoring with independent component analysis. J. Process Control (2004)
  • W. Li et al. Recursive PCA for adaptive process monitoring. J. Process Control (2000)
  • J.F. MacGregor et al. Statistical process control of multivariate processes. Control Eng. Pract. (1995)
  • J.F. MacGregor et al. Data-based latent variable methods for process analysis, monitoring and control. Comput. Chem. Eng. (2005)
  • S.J. Qin. Survey on data-driven industrial process monitoring and diagnosis. Annu. Rev. Control (2012)