A sequential resampling approach for imbalanced batch process fault detection in semiconductor manufacturing

Journal of Intelligent Manufacturing

Abstract

Fault detection is one of the most important research topics for guaranteeing safe operation and consistent product quality, especially in the batch processes of semiconductor manufacturing. However, imbalanced fault data make it very difficult to extract the highly nonlinear and inherently time-varying dynamics of a batch process. Motivated by this, we propose a sequential oversampling discrimination approach for imbalanced batch process fault detection. Specifically, unlike traditional oversampling methods, which extract temporal features from the whole process, we use a sliding window to transform each whole batch sequence into multiple fixed-length sequences, from which robust time-varying dynamic features are extracted. Then, an oversampling neural network is applied to balance the sequences of the minority and majority classes. The required minority-class sequences are generated by an improved model that combines a variational auto-encoder and a generative adversarial network. Finally, a simplified sequential neural network is trained on the class-balanced sequences to perform the discrimination. We conduct extensive experiments on two semiconductor manufacturing datasets: one is a benchmark dataset and the other comes from a real production line. The results show significant improvement over other state-of-the-art fault detection methods and oversampling techniques.
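The sliding-window step described in the abstract can be illustrated with a minimal sketch. It assumes a batch is stored as an array of shape (time steps, process variables); the function name `sliding_windows`, the window length, and the stride are hypothetical choices made for illustration and are not taken from the article.

```python
import numpy as np


def sliding_windows(batch, window, stride=1):
    """Split one batch of shape (T, n_vars) into fixed-length windows.

    Returns an array of shape (n_windows, window, n_vars).
    `window` and `stride` are illustrative parameters (assumptions).
    """
    batch = np.asarray(batch)
    if batch.ndim != 2:
        raise ValueError("batch must have shape (time_steps, n_variables)")
    n_steps, n_vars = batch.shape
    if window > n_steps:
        # The batch is shorter than one window: no sequence can be formed.
        return np.empty((0, window, n_vars), dtype=batch.dtype)
    starts = range(0, n_steps - window + 1, stride)
    return np.stack([batch[s:s + window] for s in starts])


# Toy batch: 10 time steps, 2 process variables.
batch = np.arange(20, dtype=float).reshape(10, 2)
windows = sliding_windows(batch, window=4, stride=2)
print(windows.shape)  # (4, 4, 2)
```

In a setup like this, each window would typically inherit the label of the batch it came from, so one faulty batch yields several minority-class sequences before any synthetic oversampling is applied. Whether the article labels windows this way is not stated in the abstract.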



Acknowledgements

This work was supported in part by the National Key R&D Program of China under Grant 2018YFB1701600.

Author information

Corresponding author

Correspondence to Heming Zhang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhang, Y., Peng, P., Liu, C. et al. A sequential resampling approach for imbalanced batch process fault detection in semiconductor manufacturing. J Intell Manuf 33, 1057–1072 (2022). https://doi.org/10.1007/s10845-020-01716-5

  • DOI: https://doi.org/10.1007/s10845-020-01716-5
