Attention-Based Convolutional Aggregation: An Efficient Model for Off-Gas Profile Forecasting and Dynamic Pre-Control of BOF Steelmaking

Xie, Tian-yi; Zhang, Jun-guo; Li, Lan-jie; Zhang, Fei; Liu, Shuai; Guo, Han-jie

doi:10.1007/s44196-024-00713-3


Research Article
Open access
Published: 23 December 2024

Volume 17, article number 312, (2024)

International Journal of Computational Intelligence Systems


Tian-yi Xie¹,
Jun-guo Zhang³,
Lan-jie Li²,
Fei Zhang²,⁴,
Shuai Liu⁵ &
…
Han-jie Guo⁶


Abstract

This study demonstrates that the carbon monoxide (CO), carbon dioxide (CO₂), and CO + CO₂ curves in the off-gas profile are forecastable, and achieves 32-s-ahead forecasting of them. This establishes a technical foundation for addressing the delay in off-gas profile display and for enabling pre-control in BOF steelmaking based on the features of the forecasted curves. First, a data pre-processing method, termed the mixed-batch approach, is proposed based on the characteristics of the off-gas curves: there are many samples, but each sample contains a limited number of time-steps. The importance of the channels and time-steps of the time series is also analyzed using models with an attention mechanism. Then, a deep-learning model named attention-based convolutional aggregation (ABCA) is proposed to forecast the dynamic off-gas profile. It incorporates artificial intelligence (AI) techniques such as aggregation structures, causal dilation convolution, attention mechanisms, and residual connections. Its coefficients of determination (R²) for the CO, CO₂, and CO + CO₂ curves reached 0.9386, 0.8566, and 0.9428, respectively, while the mean squared error (MSE) values were 47.3884, 11.9314, and 54.3583, respectively. These results outperform the benchmark state-of-the-art (SOTA) models. Additionally, ABCA was implemented in a forecasting tool for external validation, which showed that ABCA has good forecasting accuracy and robustness. Finally, four approaches to pre-control of the BOF steelmaking process using the forecasted off-gas profile are provided as examples.


1 Introduction

Basic oxygen furnace (BOF) steelmaking is the most widely used steelmaking method in the world. In this process, the gas content curves of CO and CO₂ in the off-gas profile are of critical importance. These two curves directly reflect the primary steelmaking reaction (the carbon–oxygen reaction) and serve as essential references for dynamic control in BOF steelmaking. Since the CO + CO₂ curve is minimally affected by the reaction between CO and pipeline oxygen, it also serves as an important reference for dynamic BOF control. Additionally, real-time monitoring of the CO curve is crucial in the gas recovery system to prevent explosions. However, due to the need for off-gas cooling or structural constraints within the plant, most gas composition probes are positioned some distance above the BOF vessel mouth. This delays the detection and display of the gas composition, creating significant challenges for dynamic steelmaking control and safe gas recovery based on the off-gas profile. In some plants, this delay can reach up to 1 min. Forecasting the CO, CO₂, and CO + CO₂ curves in the off-gas profile not only offsets the negative impact of these delays but also serves as a forecast of the carbon–oxygen reactions themselves, since the curves directly reflect those reactions. Using the forecasted curves, pre-control of the steelmaking process can be implemented. Currently, there are no public precedents for forecasting the off-gas profile of BOF steelmaking or for using the forecasted curves to pre-control a BOF steelmaking process. This study addresses both gaps. A method for forecasting the CO, CO₂, and CO + CO₂ curves in the off-gas profile is proposed, accurate forecasting is achieved, and several approaches for dynamic pre-control based on the forecasted curves are introduced.

In fact, the off-gas profile is a specialized time series, with each time-step containing instantaneous values for various gas content percentages. Forecasting the off-gas profile can be abstracted as a short-term time-series forecasting task. In recent years, NLP-based deep-learning technologies have increasingly been applied to time-series tasks. Bahdanau et al. introduced the seq2seq attention-based long short-term memory (LSTM)/gated recurrent unit (GRU) approach (2014) [1] for time-series processing. Vaswani et al. proposed the Transformer model (2017) [2], while Tan et al. employed an improved convolutional recurrent neural network (2018) [3]. Compared to traditional machine learning models, these approaches offer higher robustness and greater capacity for handling complex time series. Additionally, with pre-trained models, deep-learning techniques facilitate easier deployment and transfer learning.

Beyond the theoretical research mentioned, time-series forecasting techniques are also widely applied to practical problems. For example, Gasparin et al. achieved electric load forecasting (2022) [4] with deep networks, Cao et al. used LSTM for financial time-series forecasting (2019) [5], and Prakarsha et al. developed an ANN model for biomedical signal forecasting (2022) [6]. These studies demonstrate the significant application potential of deep-learning models and time-series forecasting techniques. However, applying them directly to forecasting the curves in the off-gas profile of BOF steelmaking presents substantial challenges:

(1) Inefficient backbone: These methods primarily use naive ANN or LSTM models, which have been shown in surveys [7, 8] to perform worse than state-of-the-art models. To improve the backbone, newer AI techniques such as residual connections and attention mechanisms should be employed, and SOTA models should be tested or used as benchmarks. Additionally, some studies did not conduct ablation experiments, making it unclear whether the adopted modules actually improve accuracy.
(2) Different data format: Some methods involve data with only a single set of curves, but with many time-steps. For example, the ETT-small dataset introduced by Zhou et al. (2021) [9] consists of data collected from July 2016 to July 2018. Unlike these datasets, off-gas profile data consist of many heats, but each heat has only about a thousand time-steps. Therefore, different data pre-processing methods are required.
(3) Different data characteristics: Unlike the data used in the aforementioned studies, the curves in the off-gas profile exhibit almost no seasonality or cyclicality. According to field experience and research [10,11,12,13], the curves in the off-gas profile are related to endpoint conditions but are subject to many unpredictable sudden changes, especially during adjustments in blowing practices and additives. Therefore, more efficient models are needed.

By achieving accurate forecasting of key indicators, the proposed method can be applied to real-time control and pre-control of the BOF process, as well as to the control of the gas recovery system. To address the above challenges and accurately forecast the CO, CO₂, and CO + CO₂ curves in the off-gas profile, this study undertakes the following key tasks:

(i) The concept of off-gas profile forecasting was proposed. A statistical analysis of the off-gas profile data was conducted, and based on the data characteristics, a new data pre-processing method, termed the mixed-batch method, was introduced.
(ii) A channel and time-step attention mechanism module was designed, along with a deep-learning algorithm that incorporates AI techniques such as aggregation structures, causal dilation convolution, attention mechanisms, and residual connections. Using this algorithm, accurate forecasting of the CO, CO₂, and CO + CO₂ curves in the off-gas profile was achieved.
(iii) The concept of using forecasted curves for pre-control in BOF steelmaking was proposed, along with several examples. Attention weights were used to quantify the importance of channels and time-steps in the off-gas profile data.
(iv) The methods proposed in this work were externally validated, and the validation results are reported. A corresponding forecasting tool was developed.

The rest of this paper is organized as follows: Sect. 2 introduces the task of off-gas profile forecasting and describes the off-gas profile data, the mixed-batch method, the targets, and the evaluation metrics; Sect. 3 introduces the structure of the proposed model; Sect. 4 presents the results; Sect. 5 gives examples of pre-control methods; Sect. 6 presents the results of external validation; and Sect. 7 concludes the paper.

2 Preliminaries

2.1 Short-Term Time-Series Forecasting Task

With a rolling sliding window of fixed size ${N}_{x}$ and a fixed forecasting interval $\omega$, the model receives the input ${\chi }^{t}=\left\{{x}_{1}^{t}, {x}_{2}^{t}, {x}_{3}^{t},\dots ,{x}_{{N}_{x}-1}^{t},{x}_{{N}_{x}}^{t} \mid {x}_{i}^{t} \in {\mathbb{R}}^{{d}_{x}}\right\}$ at time-step $t$, and the target is to forecast the corresponding value ${y}^{t+\omega } \in {\mathbb{R}}^{{d}_{y}}$ at time-step $t+\omega$. The interval between time-steps is 1 s, according to the LOMAS sampling frequency. In this work, $\omega$ is fixed to 32 according to the BOF workshop requirements, which means that the forecasting model receives an input ${\chi }^{t}$ covering an ${N}_{x}$-second sliding window and forecasts the value ${y}^{t+32}$ that will occur 32 s later. Models with ${N}_{x}=8$, ${N}_{x}=16$, and ${N}_{x}=32$ were developed for comparison.

2.2 Data Description

The dataset contains independent time series with seven channels from 5198 heats. The heats were randomly divided into a training set (60%), a validation set (20%), and a test set (20%). Each channel is a curve recorded between the descent and the elevation of the oxygen lance. The channels are total off-gas flow (Flow), oxygen lance height (Lc-height), oxygen cumulative consumption (O₂-blown), and content percentage of carbon monoxide (cp-CO), carbon dioxide (cp-CO₂), hydrogen (cp-H₂), and oxygen (cp-O₂). According to the sampling frequency of the off-gas recovery system, the time-step interval is 1 s. All channels are normalized with the mean and variance of the training set. Table 1 gives a statistical description of the values for each channel, and Fig. 1 visualizes each curve in the time-series data of a typical heat.
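The split and normalization described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names, the random seed, and the use of the standard deviation (the square root of the training-set variance) as the scaling factor are assumptions, and each heat is assumed to be stored as an array of shape (time-steps, 7).

```python
import numpy as np

def split_heats(n_heats, seed=0):
    """Randomly assign heat indices to train/validation/test sets (60/20/20)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_heats)
    n_train = int(0.6 * n_heats)
    n_val = int(0.2 * n_heats)
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])

def fit_scaler(train_heats):
    """Per-channel mean and standard deviation over all training time-steps."""
    stacked = np.concatenate(train_heats, axis=0)  # (total_steps, channels)
    return stacked.mean(axis=0), stacked.std(axis=0)

def normalize(heat, mean, std, eps=1e-8):
    """Scale one heat with training-set statistics only (assumed z-score form)."""
    return (heat - mean) / (std + eps)
```

Splitting by heat rather than by time-step keeps every sub-sequence of a heat in the same subset, and fitting the statistics on the training heats alone avoids leaking validation or test information into the inputs.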

Table 1 Descriptive statistics of channels


2.3 Mixed-Batch Method

Under this multi-scenario (many different heats), short-series condition, a rolling sliding window of ${N}_{x}+32$ time-steps is used to process the time series in the training and validation sets. Each slide produces a new sub-sequence. The first ${N}_{x}$ time-steps of the sub-sequence are the inputs, and the values at the last time-step are the target. During training and validation, these sub-sequences were randomly assigned to batches without replacement. The mixed-batch operation was not carried out for the test set. Instead, the time series of each heat in the test set was processed by the sliding window, and all sub-sequences of that heat were fed to the model in order to forecast the whole curves. Figure 2 is a schematic diagram of the input and output of the models.
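A minimal sketch of the windowing and mixed-batch steps is given below, assuming each heat is a NumPy array of shape (time-steps, channels). The names `make_windows`, `mixed_batches`, and `target_cols`, as well as the batch size and seed, are hypothetical and not taken from the paper.

```python
import numpy as np

OMEGA = 32  # forecast horizon in time-steps (1 s each), as fixed in Sect. 2.1

def make_windows(heat, n_x, target_cols, omega=OMEGA):
    """Slide an (n_x + omega)-step window over one heat.

    Returns inputs of shape (num, n_x, channels) and targets of shape
    (num, len(target_cols)) read from the last time-step of each window.
    """
    span = n_x + omega
    num = heat.shape[0] - span + 1
    if num <= 0:  # heat shorter than one window
        return (np.empty((0, n_x, heat.shape[1])),
                np.empty((0, len(target_cols))))
    xs = np.stack([heat[s:s + n_x] for s in range(num)])
    ys = np.stack([heat[s + span - 1, target_cols] for s in range(num)])
    return xs, ys

def mixed_batches(heats, n_x, target_cols, batch_size, seed=0):
    """Pool windows from all heats, shuffle once, yield batches without replacement."""
    pairs = [make_windows(h, n_x, target_cols) for h in heats]
    x = np.concatenate([p[0] for p in pairs])
    y = np.concatenate([p[1] for p in pairs])
    order = np.random.default_rng(seed).permutation(len(x))
    for i in range(0, len(order), batch_size):
        idx = order[i:i + batch_size]
        yield x[idx], y[idx]
```

For the test set, `make_windows` alone would be applied to each heat so that the sub-sequences stay in chronological order and the whole curve can be reconstructed.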

2.4 Target

Curves of cp-CO and cp-CO₂. They directly represent the decarburization reactions.

Curve of the content percentage of carbon monoxide plus carbon dioxide (cp-CO + CO₂). It is more stable, because the total carbon content is hardly affected by the later reaction between CO and oxygen in the pipeline. Its features are also directly related to the chemical reactions of the BOF steelmaking process.

2.5 Evaluation Metrics

The coefficient of determination, ${R}^{2}=1-\frac{\sum_{i=1}^{n}{\left({Y}_{i}-{\widehat{Y}}_{i}\right)}^{2}}{\sum_{i=1}^{n}{\left({Y}_{i}-\overline{Y }\right)}^{2}}$, and the mean squared error, $\mathrm{MSE}=\frac{1}{n}\sum_{i=1}^{n}{\left({Y}_{i}-{\widehat{Y}}_{i}\right)}^{2}$, were taken as evaluation metrics, where $i$ is the sample index, ${Y}_{i}$ is the actual value of the target, ${\widehat{Y}}_{i}$ is the forecasted value, $\overline{Y }$ is the average of the actual values, and $n$ is the number of samples.
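As a quick check of the two metrics, the definitions above translate directly into NumPy. The function names are illustrative, and the numbers in the usage note are a toy example, not results from the paper.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between actual and forecasted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)
```

For actual values [1, 2, 3, 4] and forecasts [1, 2, 3, 5], the residual sum of squares is 1 and the total sum of squares is 5, so MSE = 0.25 and R² = 0.8.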

3 Attention-Based Convolutional Aggregation (ABCA)

ABCA is a deep-learning model proposed for effective forecasting of the cp-CO, cp-CO₂, and cp-CO + CO₂ curves. Aggregation means the combination of different functional blocks throughout a specially designed architecture. Major SOTA time-series forecasting models such as Informer, proposed by Zhou et al. [9], and SCINet, proposed by Liu et al. (2022) [14], only consider features of the time domain. In contrast, ABCA forecasts off-gas profiles using both the time and frequency domains, which improves the forecasting accuracy. The model has five important parts: the input block, basic block, down-sampling block, output block, and aggregation architecture. Each part is described in detail below.

3.1 Input Block

The input block aggregates features of the frequency domain and the time domain and performs initial feature extraction. In the input block, according to Eqs. 1 and 2, the input time series (time domain) is transformed to the frequency domain with the fast Fourier transform. Equation 1 is the basic Fourier transform formula, where $f\left(t\right)$ is an aperiodic function, $F\left(\omega \right)$ is the representation of the function in the frequency domain, ${e}^{-i\omega t}$ is a complex exponential function, and $\omega$ is the angular frequency (not to be confused with the forecasting interval in Sect. 2.1). Equation 2 is the discrete form of Eq. 1, which the fast Fourier transform computes, where ${X}_{k}$ represents the signal in the frequency domain, ${x}_{n}$ represents the signal in the time domain, $i$ is the imaginary unit, $k$ is the frequency index, and $N$ is the number of data points. The frequency-domain sequences are all padded to ${N}_{x}/2$ hertz and normalized

$$F\left( \omega \right) = \int\limits_{{ - \infty }}^{\infty } {f\left( t \right)e^{{ - i\omega t}} dt}$$

(1)

$$X_{k} = \sum\limits_{{n = 0}}^{{N - 1}} {x_{n} e^{{ - i2\pi kn/N}} .}$$

(2)

Then, the input time-series and frequency-domain sequences are concatenated and embedded with a full convolution layer. Figure 3 shows the structure of an input block.
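The frequency-domain branch of the input block might look like the sketch below. It is only an illustration under stated assumptions: the amplitude spectrum from `np.fft.rfft` is used as the frequency-domain feature, each channel is scaled by its largest amplitude, and the spectrum is zero-padded to the window length so that it can be concatenated with the time-domain window along the channel axis. The paper does not specify these details, and the embedding convolution is omitted.

```python
import numpy as np

def input_features(x):
    """Concatenate a time-domain window with its amplitude spectrum.

    x: array of shape (channels, n_x), already normalized.
    Returns an array of shape (2 * channels, n_x).
    """
    c, n_x = x.shape
    amp = np.abs(np.fft.rfft(x, axis=1))                 # (channels, n_x // 2 + 1)
    amp = amp / (amp.max(axis=1, keepdims=True) + 1e-8)  # assumed per-channel scaling
    padded = np.zeros((c, n_x))
    padded[:, :amp.shape[1]] = amp                       # assumed zero-padding to n_x
    return np.concatenate([x, padded], axis=0)
```

With ${N}_{x}=16$ and seven channels, the result has shape (14, 16); a sine wave with two cycles per window produces its largest amplitude at frequency index 2.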

3.2 Basic Block

The basic block is the main feature extraction block. It consists of four parts: a causal dilation convolution module, an attention mechanism module, layer normalization, and a residual connection.

The Temporal Convolutional Network (TCN) proposed by Lea et al. [15] is a time-series processing model based on one-dimensional convolutional neural networks. It introduced causal convolution and dilation convolution. Causal convolution guarantees a one-way model, because the output of any layer at time $t$ depends only on the previous layer's outputs at time $t$ and before. This corresponds to the factorization in Eq. 3, where $p\left(x\right)$ is the joint probability of the whole sequence and each factor $p({x}_{t}|{x}_{1},{x}_{2},\dots ,{x}_{t-1})$ conditions ${x}_{t}$ only on earlier time-steps

$$p\left(x\right)= \prod\limits_{t=1}^{T}p({x}_{t}|{x}_{1},{x}_{2},\dots ,{x}_{t-1}).$$

(3)

Dilation convolution allows the convolution input to be sampled at intervals. It introduces a dilation rate parameter, so that a convolution kernel of the same size obtains a larger receptive field. For an input sequence $x$, a filter $f$ of kernel size $k$, a dilation rate $d$, and a position $s$ in the sequence, the dilation convolution is expressed as Eq. 4.

$$F\left( s \right) = \sum\limits_{{i = 0}}^{{k - 1}} {f(i) \cdot x_{{s - d \cdot i}} .}$$

(4)

A causal dilation convolution module combines causal convolution and dilation convolution. The features pass sequentially through causal dilation convolutions and their accompanying weight normalization operations, ReLU activation functions, and dropout layers. Unlike traditional causal dilation convolution layers, the residual connection is placed after the attention module and layer normalization. Figure 4 shows the structure of the modified causal dilation convolution module.
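Equation 4 can be illustrated for a single channel with a few lines of NumPy. This sketch covers only the convolution itself; the weight normalization, ReLU, and dropout layers of the module are omitted, and treating the sequence as zero before its first time-step (left zero-padding) is an assumption.

```python
import numpy as np

def causal_dilated_conv1d(x, f, d):
    """Compute F(s) = sum_{i=0}^{k-1} f[i] * x[s - d*i] for every position s.

    x: 1-D input sequence, f: filter of length k, d: dilation rate.
    Positions before the start of x are treated as zeros, so the output at s
    never depends on inputs later than s.
    """
    x = np.asarray(x, dtype=float)
    k = len(f)
    pad = d * (k - 1)
    xp = np.concatenate([np.zeros(pad), x])
    out = np.zeros(len(x))
    for s in range(len(x)):
        for i in range(k):
            out[s] += f[i] * xp[pad + s - d * i]
    return out
```

For example, with x = [1, 2, 3, 4, 5], f = [1, 1], and d = 2, each output is x[s] + x[s - 2], giving [1, 2, 4, 6, 8]. The receptive field of one such layer is 1 + (k - 1)·d time-steps, which is why a larger dilation rate widens the view without enlarging the kernel.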

Attention Mechanism. The input to each block is a tensor with multiple time-steps and multiple channels. However, the causal dilation convolution module cannot determine the importance of individual time-steps and channels and weight them accordingly. Attention mechanisms of this kind have mainly been introduced for two-dimensional cases, such as object detection and image classification; specific implementations include SENet [16] and CBAM [17]. In this work, a one-dimensional attention mechanism for time-steps and channels is proposed. It has two connected parts, the time-step attention part and the channel attention part. Their input is a tensor of size c (channels) × n (time-steps).

Time-Step Attention. The input feature maps are processed by one-dimensional average pooling and maximum pooling along the channel dimension. The outputs are concatenated along the channel dimension and then processed by a one-dimensional fully connected convolution layer (CNN) that down-samples the channels. After a sigmoid operation, an attention matrix over the n time-steps is obtained.

Channel Attention. The input feature maps are processed by one-dimensional average pooling and maximum pooling along the time-step dimension. Their outputs are then extended and squeezed by a feed-forward network that consists of one-dimensional CNNs. The squeezed feature maps are added together and then transformed by a sigmoid function to obtain an attention matrix over the c channels. This matrix is a weight map for every channel of the original feature map. Figure 5 shows the structure of the attention modules.
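The two attention parts can be sketched in NumPy as below, in the spirit of the one-dimensional adaptation described above. The weights `w1`, `w2`, and `kernel` stand in for learned parameters; the hidden width, the ReLU inside the feed-forward network, and the symmetric zero-padding of the time-step convolution are assumptions that the paper does not specify.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """x: (c, n). w1: (h, c) and w2: (c, h) act as 1x1 convolutions shared
    by the average-pooled and max-pooled vectors. Returns c weights."""
    def ffn(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # assumed ReLU between the layers
    avg = x.mean(axis=1)                     # pool along the time-step dimension
    mx = x.max(axis=1)
    return sigmoid(ffn(avg) + ffn(mx))

def time_step_attention(x, kernel):
    """x: (c, n). kernel: (2, k) with odd k. Returns n weights."""
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])   # pool along channels: (2, n)
    k = kernel.shape[1]
    padded = np.pad(pooled, ((0, 0), (k // 2, k // 2)))  # assumed symmetric padding
    scores = np.array([np.sum(padded[:, t:t + k] * kernel)
                       for t in range(x.shape[1])])
    return sigmoid(scores)

def apply_attention(x, w1, w2, kernel):
    """Weight channels first, then time-steps, as in the basic block."""
    x = x * channel_attention(x, w1, w2)[:, None]
    return x * time_step_attention(x, kernel)[None, :]
```

Both functions return values between 0 and 1 that rescale the feature map; these are the weights that an importance analysis would read out for each channel and time-step.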

In a basic block, the causal dilation convolution module is connected to the channel and time-step attention modules in sequence. The initial input is processed by a 1 × 1 convolution kernel and residually connected to the output of time-step attention block and normalized by a layer-normalization layer. Figure 6 is the structure of a basic block.

3.3 Down-Sampling Block

A down-sampling block is proposed to extract higher-dimensional abstract features and reduce the dimension of the parallel inputs. It arranges a channel attention module, a time-step attention module, and a feed-forward convolutional neural network in sequence. The inputs are first concatenated and then processed by a convolutional layer for dimensionality reduction. The reduced inputs are then weighted by the attention modules and processed by the feed-forward convolutional neural network. The dimensionally reduced original input is residually connected to the output of the feed-forward convolutional neural network, and the sum is normalized by a layer-normalization layer to give the final output. Figure 7 shows the structure of a down-sampling block.

3.4 Output Block

The output block transforms the final feature map into the output. It consists of a convolution layer and a fully connected layer (feed-forward layer). The input dimension is first reduced by the convolutional layer, and the result is then flattened. The flattened tensor is transformed to a single output by the fully connected layer. Figure 8 shows the structure of an output block.

3.5 Aggregation Structure

The aggregation structure merges blocks in a parallel pyramid network to combine and extract features in each channel. With the aggregation structure, layers are combined to learn richer combinations that span more of the feature hierarchy. Figure 9 shows the network structure of the aggregation structure.

4 Results and Analysis

4.1 Results of Comparison Experiments

First, the performance of ABCA for all targets with different input time-step lengths (${N}_{x}$) is detailed in Tables 2, 3, 4. Additionally, several SOTA time-series forecasting and classification backbones were set as benchmarks. These models include SCINet [14], DLinear proposed by Zeng et al. [18], Autoformer proposed by Wu et al. [19], and FEDformer proposed by Zhou et al. [20].

Table 2 Forecasting accuracy for different models (${N}_{x}=8$)


Table 3 Forecasting accuracy for different models (${N}_{x}=16$)


Table 4 Forecasting accuracy for different models (${N}_{x}=32$)


The results in Tables 2, 3, 4 demonstrate that the curves in the off-gas profile are forecastable, and most of the efficient models achieved good results. This indicates that the gas composition in the BOF steelmaking process exhibits significant autocorrelation, meaning that short-term data can effectively reflect longer-term trends. Furthermore, gas generation in the steelmaking process is stable at specific stages, allowing short-term forecasts to remain effective over the long term.

The results also show that the ABCA model consistently outperforms the other SOTA models across the different forecasting scenarios, especially in forecasting cp-CO and cp-CO + CO₂. This indicates that ABCA is able to capture complex features and local patterns in time series, which is attributed to its aggregation structure and attention mechanisms.

According to Tables 2, 3, 4, an input time-step length (${N}_{x}$) of 16 yields the best results. Shorter inputs help reduce noise and focus on primary trends and signals, but inputs that are too short may carry insufficient information. In the pipeline, CO reacts with oxygen, causing instability in CO₂ levels. This is the main reason for the lower accuracy in forecasting the CO₂ curves and the higher accuracy in forecasting the CO + CO₂ curves. A forecasting example for one heat is presented in Fig. 10.

4.2 Results of Ablation Experiments

The ablation experiment was designed to assess the impact of different model configurations on forecasting accuracy, with results detailed in Tables 5, 6, 7.

Table 5 Forecasting accuracy for different models (${N}_{x}=8$)


Table 6 Forecasting accuracy for different models (${N}_{x}=16$)


Table 7 Forecasting accuracy for different models (${N}_{x}=32$)


In the first variant, named causal dilation convolution-aggregation (Cdc-Agr), all attention mechanism modules were removed from the ABCA model. Another variant, referred to as naive convolutional aggregation (NCA), replaced all TCN causal dilation convolution modules with a convolution network block. A third configuration, denoted as Cdc, connected the causal dilation convolution modules sequentially without the aggregation structure. Additionally, a model named Attn-Cdc was created by sequentially connecting the causal dilation convolution modules to the attention mechanism modules, again without the aggregation structure.

The findings reveal that the ABCA model consistently outperforms all other ablation models across various configurations. This demonstrates that each component is indispensable. The inclusion of the aggregation structure in ABCA markedly enhances forecasting accuracy compared to models that rely solely on the attention mechanism. This is evident in the results presented in Table 5 for ${N}_{x}=8$, Table 6 for ${N}_{x}=16$, and Table 7 for ${N}_{x}=32$. The superior performance of the ABCA model underscores the effectiveness of its aggregation structure in improving forecasting accuracy over both sequential configurations and attention-based approaches.

4.3 Results of Importance Analysis

Based on the results presented, an importance analysis was conducted with ${N}_{x}=16$. This analysis involved arranging an attention module and a causal dilation convolution module in sequence to evaluate the significance of each channel and time-step within the input data. The importance is determined by the weights assigned by the attention module. The visualized result for a typical heat is shown in Fig. 11. In Fig. 11, the y-axis shows the sequential batch numbers within a single batch, while the x-axis represents channel names (channels column) and time-step numbers (time-steps columns).

Based on Fig. 11, it can be concluded that for different tasks and stages, the Flow, Lc-height, and O₂-blown curves are all important, with the O₂-blown curve being the most crucial. The gas flow rate (Flow) affects the oxygen content in the pipeline, which, through the reaction with CO, subsequently affects the levels of CO and CO₂. The Lc-height and O₂-blown curves are important because they reflect the blowing practice, which directly influences the key chemical reactions and temperature control in steel production and ultimately affects the generation of CO and CO₂.

For the CO and CO + CO₂ curves, the further into the middle of the steelmaking process, the more important the three mentioned curves become. This suggests that these three curves play a key role in influencing the mass transfer process of the primary C–O reactions. Interestingly, the forecasted curves (cp-CO, cp-CO₂) themselves are the least important. This highlights the causal relationship, where changes in external factors, such as blowing practices, lead to changes in the CO and CO₂ curves. Their own changes have little impact on subsequent changes.

For the importance of input time-steps at different stages, it can be observed that later time-steps are always more important than earlier ones, especially at the beginning and near the end of the steelmaking process. This is because the curves fluctuate significantly, and the later time-steps better capture the changing trends.

5 Methods of Pre-controlling BOF Steelmaking with the Forecasted Off-Gas Profile

With the forecasted off-gas profile, various pre-control methods for BOF steelmaking can be realized and implemented. The following are examples of BOF steelmaking pre-control in different aspects.

5.1 Forecasting BOF Steelmaking Stage

Accurate judgment of the steelmaking stage is a basic requirement of many practices. The BOF steelmaking process can be divided into three stages according to the carbon monoxide content in the off-gas, namely, the rising, stable, and declining stages. Disturbances of the carbon–oxygen reaction should be eliminated in the rising and declining stages to reach a stable steelmaking endpoint, while adjusting operations such as feeding additives mainly take place in the stable stage. With off-gas profile forecasting, the timing of the stages can be determined in advance from the turning points of the forecast curves. Therefore, any adjustment operation, such as feeding additives, adjusting the oxygen lance height, or adjusting the bottom blowing, can be prepared in advance according to the forecast curve. Figure 12 is an example of the steelmaking stages.
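One simple way to read stage turning points from a forecasted cp-CO curve is to threshold its slope, as in the sketch below. The rule, the 10-s slope window, and the 0.2 percentage-points-per-second threshold are hypothetical choices for illustration; the paper does not prescribe a specific detection rule.

```python
import numpy as np

def classify_stages(co_forecast, window=10, slope_thr=0.2):
    """Label each time-step of a forecasted cp-CO curve.

    window (s) and slope_thr (percentage points per second) are hypothetical
    tuning values, not taken from the paper.
    """
    y = np.asarray(co_forecast, dtype=float)
    stages = []
    for t in range(len(y)):
        slope = (y[t] - y[max(0, t - window)]) / window
        if slope > slope_thr:
            stages.append("rising")
        elif slope < -slope_thr:
            stages.append("declining")
        else:
            stages.append("stable")
    return stages

def turning_points(stages):
    """Indices where the stage label changes."""
    return [t for t in range(1, len(stages)) if stages[t] != stages[t - 1]]
```

Because the labels are computed on the forecasted curve, each turning point becomes available about 32 s before it would appear on the delayed display, which is the lead time that the adjustment operations can use.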

5.2 Forecasting Steelmaking Reactions

After the oxidation of silicon and manganese in the early (rising) stage, it is particularly important to adjust the additives and the oxygen blowing operation in the middle (stable) stage according to the forecasted steelmaking reactions. For example, carbon–oxygen reactions can be forecast from the carbon monoxide and carbon dioxide curves. A few seconds after a sudden rise in the carbon monoxide curve in the off-gas profile, the forecast curve responds and forecasts a sharp rise in carbon monoxide content. This indicates that the oxidation of the slag is too strong and that the stable carbon–oxygen reaction is disturbed. In this case, the system will pre-control the carbon–oxygen reaction by reducing the oxygen blowing rate, increasing the height of the oxygen lance, or adjusting the additives.

5.3 Pre-identifying Raw Material Quality

Quality variation in the main raw materials changes the features of the off-gas profile in the early (rising) steelmaking stage. If the raw material contains more reducible oxide or unburnt flux than expected, failing to forecast the material quality and pre-control the BOF system can lead to serious consequences. For example, incomplete calcination or improper sealing of the lime results in more calcium carbonate than expected, which produces a large amount of unexpected carbon dioxide during the rising stage of BOF steelmaking and finally leads to slopping. Fortunately, when such a problem occurs, the carbon dioxide curve rises precipitously in the early stage. With the off-gas profile forecasting model, just a few seconds after the carbon dioxide curve starts to rise, the model responds, forecasts a sharp rise in the future carbon dioxide content, and outputs the corresponding forecast curve. After receiving this forecast, the system can adjust the oxygen blowing operation in advance and switch lime silos in time to avoid slopping. Figure 13 gives an example analysis of typical reaction disturbances.

5.4 Reducing Response Latency

In many BOF mills, the LOMAS sampler is placed far away from the mouth of the BOF vessel. It takes a long time for the off-gas to pass through the exhaust pipe and be detected by the sampler and sensor, which causes latency in the BOF system response. The current model can forecast the carbon monoxide and carbon dioxide curves 32 s ahead, which greatly reduces the latency of the BOF steelmaking response.

6 Results of External Validation

The ABCA model was deployed for external validation. A Python-based tool was programmed to facilitate this process. The tool receives data of 7 channels and 16 time-steps every second. The input is then normalized, converted into tensors, and fed into the ABCA model for forecasting.
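The per-second loop of such a tool could be organized as in the following sketch. It is not the authors' tool: the class name `RollingForecaster` is hypothetical, `model` is a placeholder for any callable that maps a normalized (16, 7) window to a forecast, and maintaining the 16-step window in a rolling buffer fed one sample per second is an assumption about how the input is assembled.

```python
from collections import deque

import numpy as np

class RollingForecaster:
    """Keep the latest n_x samples and forecast once the window is full."""

    def __init__(self, model, mean, std, n_x=16):
        self.model = model                      # placeholder for the trained model
        self.mean = np.asarray(mean, dtype=float)
        self.std = np.asarray(std, dtype=float)
        self.buffer = deque(maxlen=n_x)         # oldest sample drops out automatically

    def push(self, sample):
        """Add one 1-s sample (7 channels); return a forecast or None."""
        self.buffer.append(np.asarray(sample, dtype=float))
        if len(self.buffer) < self.buffer.maxlen:
            return None                         # not enough history yet
        window = (np.stack(list(self.buffer)) - self.mean) / self.std
        return self.model(window)               # window shape: (n_x, channels)
```

During the first 15 s of a heat `push` returns `None`; from the 16th sample onward it returns one forecast per second for the value 32 s ahead.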

The tool's human–machine interface (HMI) allows users to easily select and display both real-time and forecasted curves. In case of an unexpected event, the system includes an Alert Report feature that enables users to manually record abnormal phenomena for further analysis. Figure 14 shows the HMI of the off-gas forecasting system.

After applying the ABCA model for external validation, its forecasting accuracy was recorded over 88 consecutive heats. Figure 15 presents the running charts showing the accuracy of the field production test results.

7 Conclusion

This study accomplishes the forecasting of the CO, CO₂, and CO + CO₂ curves in BOF steelmaking to eliminate the display delay of these curves, while also enabling pre-control of steelmaking based on the forecasted curves. Based on the research findings, the following conclusions can be drawn:

1. The three curves in the off-gas profile were shown to be forecastable, and the ABCA model proposed in this study demonstrates superior forecasting accuracy compared to the other benchmarks. Specifically, its R² values for the forecast of the CO, CO₂, and CO + CO₂ curves are 0.9386, 0.8566, and 0.9428, respectively, while the MSE values are 47.3884, 11.9314, and 54.3583, respectively. Ablation experiments demonstrated that the tested modifications to the ABCA model reduce its performance.
2. The forecasted curves can address the issue of display delays, while utilizing the features of the forecasted curves enables pre-control in BOF steelmaking. Four examples are provided to illustrate these two aspects.
3. The input curves related to off-gas flow rate and blowing practice are crucial in the forecasting process, with the most significant being the oxygen cumulative consumption curve; the later time-steps of the input curves are of greater importance in the forecasting process.
4. In external validation, the performance of ABCA closely mirrors its results on the test set, demonstrating strong generalization capability and robustness.

Data availability

The data involved in this study are not available due to group requests.

Acknowledgements

Thanks are given to Tangsteel Co., Ltd. of Hesteel Group and Digital Co., Ltd. of Hesteel Group for providing detailed data, hardware and software support for model development and field production test. Thanks are also given to Dr. Yun-fei Zhang and Dr. Yang Li for their technical and communication support for this work.

Funding

This work was financially supported by the National Key Research and Development Plan under grant number 2023YFB3712400 and the Natural Science Foundation of Hebei Province under grant number E2022318002.

Author information

Authors and Affiliations

Institute of Engineering Technology, University of Science and Technology Beijing, Beijing, 100083, China
Tian-yi Xie
Material Technology Research Institute, Hesteel Group, Shijiazhuang City, 050023, China
Lan-jie Li & Fei Zhang
Tangsteel Company, Hesteel Group, Tangshan City, 063611, China
Jun-guo Zhang
Metallurgical Technology Research Institute, Central Iron and Steel Research Institute Co., Ltd, Beijing, 100081, China
Fei Zhang
School of Artificial Intelligence, Beijing Technology and Business University, Beijing, 100048, China
Shuai Liu
School of Metallurgical and Ecological Engineering, University of Science & Technology Beijing, Beijing, 100083, China
Han-jie Guo

Contributions

Tian-yi Xie proposed the idea for the study, was fully responsible for data pre-processing and for model design, development, and deployment, and wrote the paper. Jun-guo Zhang provided the data support and the model application platform. Lan-jie Li provided theoretical support for the steelmaking process and was responsible for communication. Fei Zhang provided funding for this study and guidance on the principles of steelmaking. Shuai Liu provided technical support on artificial intelligence and gave guidance on the writing of the paper. Han-jie Guo provided guidance on the principles of steelmaking and knowledge of on-site steelmaking processes.

Corresponding author

Correspondence to Shuai Liu.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Xie, Ty., Zhang, Jg., Li, Lj. et al. Attention-Based Convolutional Aggregation: An Efficient Model for Off-Gas Profile Forecasting and Dynamic Pre-Control of BOF Steelmaking. Int J Comput Intell Syst 17, 312 (2024). https://doi.org/10.1007/s44196-024-00713-3

Received: 06 March 2024
Accepted: 27 November 2024
Published: 23 December 2024
DOI: https://doi.org/10.1007/s44196-024-00713-3
