Mining manufacturing databases to discover the effect of operation sequence on the product quality

Journal of Intelligent Manufacturing

Abstract

Data mining techniques can be used to discover interesting patterns in complicated manufacturing processes, and these patterns can in turn be used to improve manufacturing quality. Classical representations of quality data mining problems usually refer to the operation settings and not to their sequence. This paper examines the effect of the operation sequence on product quality using data mining techniques. For this purpose, a novel decision tree framework for extracting sequence patterns is developed. The proposed method is capable of mining sequence patterns of any length, with operations that are not necessarily immediate precedents. The core induction framework consists of four main steps. In the first step, all manufacturing sequences are represented as strings of tokens. In the second step, a large set of regular expression-based sequence patterns is induced. In the third step, we use feature selection methods to filter the initial set, leaving only the most useful patterns. In the last step, we transform the quality problem into a classification problem and employ a decision tree induction algorithm. A comparative study performed on benchmark databases illustrates the capabilities of the proposed framework.
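The four steps can be illustrated with a minimal Python sketch. It is not the paper's implementation: the toy records, the operation names, and all function names are hypothetical. The candidate patterns are limited to simple "a occurs somewhere before b" regular expressions, and the feature selection and tree induction use plain information gain (an ID3-style stand-in) rather than the specific algorithms the paper evaluates.

```python
import re
from collections import Counter
from itertools import permutations
from math import log2

# Hypothetical toy data: (operation sequence, quality label).
RECORDS = [
    (["cut", "drill", "polish", "paint"], "good"),
    (["cut", "polish", "drill", "paint"], "good"),
    (["drill", "cut", "polish", "paint"], "good"),
    (["cut", "drill", "paint", "polish"], "bad"),
    (["cut", "paint", "drill", "polish"], "bad"),
    (["paint", "cut", "drill", "polish"], "bad"),
]

def to_string(ops):
    """Step 1: represent a manufacturing sequence as a string of tokens."""
    return " ".join(ops)

def candidate_patterns(records):
    """Step 2 (simplified): one regex per ordered pair of operations,
    matching 'a precedes b', not necessarily immediately."""
    ops = sorted({op for seq, _ in records for op in seq})
    return [rf"\b{re.escape(a)}\b.*\b{re.escape(b)}\b"
            for a, b in permutations(ops, 2)]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, feature):
    gain = entropy(labels)
    for value in (True, False):
        subset = [y for x, y in zip(rows, labels) if x[feature] == value]
        gain -= len(subset) / len(labels) * entropy(subset)
    return gain

def build_tree(rows, labels, features):
    """Step 4 (simplified): ID3-style induction over binary pattern features."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority
    best = max(features, key=lambda f: info_gain(rows, labels, f))
    if info_gain(rows, labels, best) <= 0:
        return majority
    rest = [f for f in features if f != best]
    branches = {}
    for value in (True, False):
        idx = [i for i, x in enumerate(rows) if x[best] == value]
        branches[value] = build_tree([rows[i] for i in idx],
                                     [labels[i] for i in idx], rest)
    return (best, branches)

def classify(tree, ops):
    """Follow the tree by testing each node's regex on the token string."""
    text = to_string(ops)
    while isinstance(tree, tuple):
        pattern, branches = tree
        tree = branches[bool(re.search(pattern, text))]
    return tree

strings = [to_string(seq) for seq, _ in RECORDS]
labels = [label for _, label in RECORDS]
patterns = candidate_patterns(RECORDS)

# Each sequence becomes a row of binary features: does pattern p match?
rows = [{p: bool(re.search(p, s)) for p in patterns} for s in strings]

# Step 3 (simplified): keep the k patterns with the highest information gain.
selected = sorted(patterns, key=lambda p: info_gain(rows, labels, p),
                  reverse=True)[:3]

tree = build_tree(rows, labels, selected)
print(classify(tree, ["drill", "polish", "cut", "paint"]))
```

In this toy data the label depends only on whether polishing precedes painting, so the induced tree splits on a single precedence pattern even though other operations occur between the two.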

Author information

Correspondence to Lior Rokach.

About this article

Cite this article

Rokach, L., Romano, R. & Maimon, O. Mining manufacturing databases to discover the effect of operation sequence on the product quality. J Intell Manuf 19, 313–325 (2008). https://doi.org/10.1007/s10845-008-0084-6

  • DOI: https://doi.org/10.1007/s10845-008-0084-6
