
A layer-wise neural network for multi-item single-output quality estimation

Published in Journal of Intelligent Manufacturing

Abstract

A layer-wise neural network architecture is proposed for the classification and regression of time series data where multiple instances have a single output. This data format is encountered in the manufacturing industry, where parts are produced in batches—due to the short production cycle—and labelled as a whole for defects. The end-to-end neural network approach is benchmarked against a previously proposed feature engineering method based on mean shift clustering and K-nearest neighbours with dynamic time warping, and against a naive approach of flattening the instances and training a support vector machine. An ablation study is performed on a layer-wise 1D-convolutional neural network (CNN) to understand which of the architectural design choices are critical for prediction performance. Based on a transfer moulding production dataset, it is found that the layer-wise 1D-CNN and multilayer perceptron (MLP) have the best performance across most of the common classification and regression metrics, but the layer-wise MLP has a lower computational cost. Finally, it is shown that the proposed parameter sharing in the dense layers of both networks is key to reducing the number of parameters and improving prediction performance.
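The parameter-sharing idea highlighted in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the shapes, weight names, and the mean-pooling aggregation are assumptions chosen only to show how one shared dense layer maps a batch of instances to a single output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: a batch of 4 parts from one production cycle,
# each represented by a length-50 feature vector; one label per batch.
n_instances, seq_len, hidden = 4, 50, 8

# Parameter sharing: ONE weight matrix is applied to every instance,
# rather than a separate dense layer per instance.
W_shared = rng.normal(scale=0.1, size=(seq_len, hidden))
b_shared = np.zeros(hidden)
w_out = rng.normal(scale=0.1, size=hidden)

def predict_batch(X):
    """X: (n_instances, seq_len) -> single scalar output for the batch."""
    h = np.maximum(X @ W_shared + b_shared, 0.0)  # shared dense + ReLU per instance
    pooled = h.mean(axis=0)                        # aggregate across instances
    return float(pooled @ w_out)                   # single regression output

X = rng.normal(size=(n_instances, seq_len))
y_hat = predict_batch(X)

# Sharing keeps the dense-layer parameter count independent of the number
# of instances: 50*8 + 8 = 408 weights, versus 4x that without sharing.
shared_params = W_shared.size + b_shared.size
```

With per-instance layers the count would grow linearly in `n_instances`, which is the motivation the abstract gives for sharing.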


Figures 1–5


Notes

  1. The purpose of the threshold is to ensure that the cluster is not an anomalous result but rather a distinct pattern associated with zero defects.

  2. The two clusters enable separation of sequences associated with zero and non-zero defects.
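The thresholding described in these notes can be sketched in a few lines. The cluster labels and the minimum-size value below are invented for illustration; the point is only that a cluster is kept as a genuine zero-defect pattern when it contains enough sequences, and discarded as anomalous otherwise.

```python
import numpy as np

# Toy cluster assignments, e.g. as produced by mean shift clustering:
# cluster 0 has 5 sequences, cluster 1 has 3, cluster 2 has only 1.
labels = np.array([0, 0, 0, 0, 0, 1, 1, 1, 2])
min_size = 3  # hypothetical threshold from note 1

sizes = np.bincount(labels)                     # members per cluster
valid_clusters = np.flatnonzero(sizes >= min_size)
# cluster 2 falls below the threshold and is treated as anomalous
```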


Acknowledgements

This work was supported by the RIE2020 Advanced Manufacturing and Engineering (AME) IAF-PP (A19C1a0018).

Author information

Correspondence to Edward K. Y. Yapp.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yapp, E.K.Y., Gupta, A. & Li, X. A layer-wise neural network for multi-item single-output quality estimation. J Intell Manuf 34, 3131–3141 (2023). https://doi.org/10.1007/s10845-022-01995-0

