Abstract
Fault detection is an important research topic for guaranteeing safe operation and consistent product quality, especially in the batch processes of semiconductor manufacturing. However, imbalanced fault data make it difficult to capture the high nonlinearity and inherently time-varying dynamics of a batch process. Motivated by these challenges, we propose a sequential oversampling discrimination approach for imbalanced batch process fault detection. Specifically, unlike traditional oversampling methods, which extract temporal features from the whole process, we use a sliding window to transform each whole batch sequence into multiple fixed-length sequences, from which robust time-varying dynamic features are extracted. Then, an oversampling neural network balances the sequences of the minority and majority classes. The required minority-class sequences are generated by an improved model that combines a variational auto-encoder and a generative adversarial network. Finally, a simplified sequential neural network is trained on the class-balanced sequences to perform the discrimination. We conduct extensive experiments on two semiconductor manufacturing datasets: a benchmark dataset and a dataset from a real production line. The results show significant improvement over other state-of-the-art fault detection methods and oversampling techniques.
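The sliding-window step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sliding_windows` and `minority_deficit`, the `window` and `stride` parameters, the toy data, and the 1:1 balancing target are all assumptions made for illustration, since the abstract does not state these details.

```python
from typing import List, Sequence

Step = Sequence[float]  # one time step: a vector of sensor readings


def sliding_windows(batch: Sequence[Step], window: int, stride: int = 1) -> List[List[Step]]:
    """Split one batch sequence into fixed-length, possibly overlapping windows.

    `window` and `stride` are hypothetical parameters; the abstract only says
    that fixed-length sequences are produced by a sliding window.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    # A batch shorter than the window yields no sequences in this sketch.
    return [list(batch[start:start + window])
            for start in range(0, len(batch) - window + 1, stride)]


def minority_deficit(n_majority: int, n_minority: int) -> int:
    """Synthetic minority sequences needed for a 1:1 balance (an assumed target)."""
    return max(n_majority - n_minority, 0)


# Toy batch: 6 time steps, 2 sensor variables per step.
batch = [[float(t), float(t) * 10.0] for t in range(6)]
windows = sliding_windows(batch, window=4, stride=1)
print(len(windows))   # number of fixed-length sequences from this batch
print(windows[1][0])  # first time step of the second window
```

Each window would then be treated as one training sequence, and `minority_deficit` indicates how many faulty sequences a generative model would need to supply under the assumed balancing target.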
Acknowledgements
This work was supported in part by the National Key R&D Program of China under Grant 2018YFB1701600.
Cite this article
Zhang, Y., Peng, P., Liu, C. et al. A sequential resampling approach for imbalanced batch process fault detection in semiconductor manufacturing. J Intell Manuf 33, 1057–1072 (2022). https://doi.org/10.1007/s10845-020-01716-5
DOI: https://doi.org/10.1007/s10845-020-01716-5