A sequential resampling approach for imbalanced batch process fault detection in semiconductor manufacturing

Zhang, Yi; Peng, Peng; Liu, Chongdang; Xu, Yanyan; Zhang, Heming

doi:10.1007/s10845-020-01716-5

Published: 23 November 2020

Volume 33, pages 1057–1072, (2022)

Journal of Intelligent Manufacturing

Yi Zhang¹,², Peng Peng¹, Chongdang Liu¹, Yanyan Xu¹,³ & Heming Zhang¹

Abstract

Fault detection is an important research topic for ensuring safe operation and consistent product quality, especially in the batch processes of semiconductor manufacturing. However, imbalanced fault data make it difficult to capture the highly nonlinear and inherently time-varying dynamics of the batch process. Motivated by these challenges, we propose a sequential oversampling discrimination approach for imbalanced batch process fault detection. Specifically, unlike traditional oversampling methods, which extract temporal features from the whole process, we use a sliding window to transform each whole batch sequence into multiple fixed-length sequences, from which robust time-varying dynamic features are extracted. Then, an oversampling neural network is applied to balance the sequences of the minority and majority classes. The required minority-class sequences are generated by an improved model that combines a variational auto-encoder and a generative adversarial network. Finally, a simplified sequential neural network is trained on the class-balanced sequences to perform the discrimination. We conduct extensive experiments on two semiconductor manufacturing datasets: a benchmark dataset and a dataset from a real production line. The results show significant improvement over other state-of-the-art fault detection methods and oversampling techniques.
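The sliding-window step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the window length, the stride, the toy data, and the label convention (0 = normal, 1 = fault) are all assumptions made for illustration. The sketch only cuts variable-length batches into fixed-length sequences and counts how many minority-class sequences would have to be generated to balance the classes; the generative model itself (the combined variational auto-encoder and generative adversarial network) is not shown.

```python
import numpy as np


def sliding_windows(batch, window, stride=1):
    """Cut one batch of shape (T, n_vars) into windows of shape (window, n_vars)."""
    batch = np.asarray(batch)
    if batch.shape[0] < window:
        # A batch shorter than the window yields no sequences.
        return np.empty((0, window) + batch.shape[1:], dtype=batch.dtype)
    starts = range(0, batch.shape[0] - window + 1, stride)
    return np.stack([batch[s:s + window] for s in starts])


def segment_batches(batches, labels, window, stride=1):
    """Segment every batch; each window inherits the label of its batch (assumption)."""
    segments, segment_labels = [], []
    for batch, label in zip(batches, labels):
        w = sliding_windows(batch, window, stride)
        segments.append(w)
        segment_labels.extend([label] * len(w))
    return np.concatenate(segments), np.array(segment_labels)


# Toy data (hypothetical): three batches of different lengths, three variables each.
rng = np.random.default_rng(0)
batches = [rng.normal(size=(t, 3)) for t in (10, 12, 9)]
labels = [0, 0, 1]  # assumed convention: 0 = normal (majority), 1 = fault (minority)

X, y = segment_batches(batches, labels, window=5, stride=1)

# Number of minority-class sequences an oversampler would need to generate.
n_needed = int((y == 0).sum() - (y == 1).sum())
print(X.shape, n_needed)
```

With a window of 5 and a stride of 1, a batch of length T yields T - 5 + 1 sequences, so the toy batches above give 6, 8, and 5 sequences. Labeling every window with the label of its parent batch is a simplification; whether the paper labels windows this way is not stated in the abstract.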

Acknowledgements

This work was supported in part by the National Key R&D Program of China under Grant 2018YFB1701600.

Author information

Authors and Affiliations

Department of Automation, Tsinghua University, Beijing, 100091, China
Yi Zhang, Peng Peng, Chongdang Liu, Yanyan Xu & Heming Zhang
Naval Research Academy, Beijing, 100073, China
Yi Zhang
Unit 94926, Wuxi, 214141, China
Yanyan Xu

Corresponding author

Correspondence to Heming Zhang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, Y., Peng, P., Liu, C. et al. A sequential resampling approach for imbalanced batch process fault detection in semiconductor manufacturing. J Intell Manuf 33, 1057–1072 (2022). https://doi.org/10.1007/s10845-020-01716-5

Received: 25 March 2020
Accepted: 12 November 2020
Published: 23 November 2020
Issue Date: April 2022
DOI: https://doi.org/10.1007/s10845-020-01716-5
