Classifying defect factors in fabric production via DIFACONN-miner: A case study

doi:10.1016/j.eswa.2011.02.182

Expert Systems with Applications

Volume 38, Issue 9, September 2011, Pages 11321-11328

https://doi.org/10.1016/j.eswa.2011.02.182

Abstract

In this paper, a data mining based case study is carried out in a major textile company in Turkey in order to classify and analyze the defect factors in its fabric production process. The aim is to understand the causes of the defects in order to minimize their occurrence. The main motivation behind this study is to minimize scrap losses in the company and to enable more sustainable production via data mining. In the analyses, a data mining tool (DIFACONN-miner) that was recently developed by the authors is employed. DIFACONN-miner is a novel data mining tool which combines several metaheuristics and artificial neural networks intelligently, and it is capable of producing comprehensive classification rules from any data type.

Research highlights

► In this paper, a novel approach is provided to classify quality defects in fabric production. ► A real life case study and its results are provided. ► A novel data mining tool, DIFACONN-miner, which was recently proposed by the authors, is used to model and solve the problem. ► It is shown that DIFACONN-miner is able to model and classify quality defects in fabric production.

Introduction

Employing data mining to analyze manufacturing data can be a great support to sustainable and green manufacturing. This is mainly due to the fact that knowledge gathered via a data mining (DM) approach can be used to improve manufacturing productivity, reduce scrap rates and minimize rework. All these improvements mean less material and energy consumption, which is a prerequisite for sustainable and green manufacturing practices. Based on this motivation, we decided to carry out a data mining based study for defect diagnosis in fabric production in a major textile company in Turkey. Several other authors in the literature have also recognized that data mining can be a very useful enabling tool for sustainable manufacturing. For example, Modapothala and Issac (2009) employed data mining for analyzing corporate environmental reports. They conclude that their findings confirm a reliable direction in sustainability with the use of data mining techniques. Marwah, Sharma, Ramakrishnan, Bash, and Patel (2009a) pointed out that, given the large amount of data produced by physical ecosystems, manual inspection is virtually impossible, and thus automated knowledge discovery and data mining techniques are required to synthesize models for enabling sustainable end-to-end operation and management of physical ecosystems. They defined a physical ecosystem as a collection of entities and processes that consume physical resources and fulfill functionality. They proposed a data mining framework which uses sustainability metrics such as energy, power consumption and carbon footprint to quantify the desirability of operating the system. Their aim was to monitor the data patterns and determine whether it is possible to transform the current operational state into a more efficient one. They mentioned that the domain knowledge required to move from one state to another, where feasible, can be encoded into actionable rules which can be generated through data mining. Similar conclusions can be found in Bash et al. (2008) and Marwah et al. (2009b).

In recent years, defect diagnosis has become an important activity for ensuring high quality production. It involves identifying defect information, which can then be used to adjust the manufacturing process accordingly in order to improve the manufacturing yield. Effective defect diagnosis can result in low-cost, high-quality products and improve many other manufacturing performance metrics. A DM approach can be very useful here, as it is one of the best approaches for analyzing vast amounts of operational data. DM, defined as the nontrivial extraction of implicit, previously unknown and potentially useful information from data (Frawley, Piatetsky-Shapiro, & Matheus, 1992), is gaining great interest in economic, manufacturing and scientific domains. DM encompasses a number of different technical approaches, such as clustering, data summarization, learning classification rules, finding dependency networks and detecting anomalies (Baykasoğlu & Özbakır, 2007). These DM techniques are used to reveal critical information hidden in data sets. In this study, a “classification rule extraction” type DM approach is used. Classification rule extraction is a common data mining task characterized by a concern for finding highly predictive rules, often by using heuristic techniques. Classification is the process of finding a set of models or functions which describe and distinguish data classes or concepts, so that the model can be used to predict the class of objects whose class label is unknown (Han & Kamber, 2001). A classification model is built by analyzing a set of training data. In classification, discovered knowledge is generally represented in the form of IF–THEN rules. Here, the main goal of a rule extraction system is to obtain insights into data sets.
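To make the IF–THEN rule form concrete, the following minimal Python sketch shows one way a rule set could be represented and scored on labeled records. It is an illustration only and not the rule format used by DIFACONN-miner: the attribute names (`machine_speed`, `yarn_count`), the thresholds, the class labels and the helper names `Rule`, `classify` and `accuracy` are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]


@dataclass
class Rule:
    """IF every condition holds for a record THEN predict `label`."""
    conditions: List[Callable[[Record], bool]]
    label: str

    def covers(self, record: Record) -> bool:
        return all(cond(record) for cond in self.conditions)


def classify(rules: List[Rule], record: Record, default: str) -> str:
    """Return the label of the first rule covering the record, else `default`."""
    for rule in rules:
        if rule.covers(record):
            return rule.label
    return default


def accuracy(rules: List[Rule], records: List[Record],
             labels: List[str], default: str) -> float:
    """Fraction of records whose predicted label matches the true label."""
    hits = sum(classify(rules, r, default) == y for r, y in zip(records, labels))
    return hits / len(records)


# Hypothetical attributes and thresholds, chosen only for illustration:
# IF machine_speed > 600 AND yarn_count < 20 THEN class = "defect".
rules = [
    Rule([lambda r: r["machine_speed"] > 600,
          lambda r: r["yarn_count"] < 20], "defect"),
]
records = [
    {"machine_speed": 650, "yarn_count": 18},
    {"machine_speed": 500, "yarn_count": 18},
    {"machine_speed": 700, "yarn_count": 30},
]
labels = ["defect", "no_defect", "defect"]

print(accuracy(rules, records, labels, default="no_defect"))
```

In this toy data the single rule covers only the first record, so the third record falls through to the default class and is misclassified. The example also shows why rule sets are usually judged on both accuracy and the number of rules.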

To carry out an effective data mining study, we need an effective data mining algorithm that is able to produce high quality, accurate classification rules. For that purpose, in recent years there have been numerous attempts to apply various algorithms in data mining to accomplish the classification task accurately. The artificial neural network (ANN) is one of the models most widely used for that purpose. An ANN is a mathematical or computational model based on biological neural networks. It consists of an interconnected group of artificial neurons. ANNs are excellent at predicting, learning from experience, and generalizing from previous examples (Fu & Wang, 2001). Due to this fact, many researchers have focused on applying ANNs in the area of classification rule extraction. However, a main disadvantage of ANNs is the difficulty of explaining how and what exactly an ANN has learned. To overcome this difficulty, researchers have developed new approaches for rule extraction from trained ANNs.

Elalfi, Haque, and Elalami (2004) presented a new algorithm for extracting accurate and comprehensible classification rules from databases via trained ANNs by using genetic algorithms (GA). Their algorithm does not depend on the ANN training algorithm and does not modify the training results. The GA was used to find the optimal values of the input attributes, which maximize the output function of the output nodes. The optimal chromosome was then decoded to obtain a rule belonging to a target class. Markowska-Kaczmar and Wnuk-Lipinski (2004) presented a method for rule extraction from an ANN based on a GA approach with Pareto optimization. They described the idea of Pareto optimization, showed the details of the proposed method, and tested it on well known benchmark data sets. Li and Wang (2004) proposed a hybrid system to extract classification rules from decision tables. In contrast to previous studies, in their system ANNs serve only as a tool to reduce the decision table and filter its noise, while the final rule set is generated from the reduced decision table by using rough sets. They verified the effectiveness of their approach through experiments and comparisons with traditional rough set and ANN approaches. Tokinaga, Lu, and Ikeda (2005) studied the use of ANN rule extraction techniques based on genetic programming (GP) to build intelligent and explanatory evolution systems. They used GP to automate the rule extraction process in the trained ANN, where the statements are changed into a binary classification. As applications, they generated rules for the prediction of bankruptcy and creditworthiness as binary classifications, and applied their method to the multi-level classification of corporate bonds by using financial indicators. Hruschka and Ebecken (2006) proposed a clustering-based approach for extracting rules from multi-layer perceptron type ANNs in classification problems. Their rule extraction algorithm basically consists of two steps. First, a clustering GA is applied to find clusters of hidden unit activation values. Then classification rules describing these clusters, in relation to the inputs, are generated. They experimentally evaluated the performance of their approach on four datasets that are benchmarks for DM applications. Setiono, Baesens, and Mues (2009) addressed the generation of comprehensible rule sets from trained ANNs. Their algorithm is particularly appropriate in applications where comprehensibility as well as accuracy is required. Their experimental results show that their algorithm produces accurate rule sets that are concise and comprehensible. Özbakır, Baykasoğlu, Kulluk, and Yapıcı (2009) presented a study on rule extraction from trained ANNs for classification problems. The proposed method uses an ant colony optimization algorithm for extracting accurate and comprehensible rules from databases via trained ANNs. They experimentally evaluated their algorithm on five benchmark data sets, and their results show that the proposed algorithm has the potential to generate accurate and concise rules.

Recently, the authors (Özbakır, Baykasoğlu, & Kulluk, 2010) developed a novel algorithm named DIFACONN-miner for classification rule extraction from ANNs. Unlike previous approaches, DIFACONN-miner integrates the ANN training and rule extraction phases. Therefore, classification rules can be obtained directly from ANNs without an extra rule extraction step. This means that at every ANN training step the corresponding classification rules are generated simultaneously, and training is steered toward generating more accurate classification rules. In fact, DIFACONN-miner can be called a “rule generation algorithm” rather than a “rule extraction algorithm”. DIFACONN-miner uses the differential evolution (DE) algorithm for training ANNs and the touring ant colony optimization (TACO) algorithm for generating classification rules. The fitness of an ANN structure is evaluated according to a multiple objective function which consists of three performance measures, namely “error of ANN”, “number of rules” and “training accuracy”. The performance of DIFACONN-miner was evaluated by the authors on many different test problems, and it was shown that DIFACONN-miner is able to produce accurate and effective classification rules. Therefore, we selected DIFACONN-miner as the data miner in the present case study. In the following sections of this paper, first a brief overview of DIFACONN-miner is given. Afterwards, the case study and the obtained results are presented along with some conclusions.
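As a rough illustration of how DE can drive the search over ANN weights, the sketch below implements one generation of the textbook DE/rand/1/bin scheme on plain weight vectors, together with a hypothetical weighted combination of the three performance measures named above. The DE variant, the parameter values `f` and `cr`, the weights in `combined_fitness`, and the use of a sphere function as a stand-in for the ANN error are all assumptions made here; the actual settings and objective of DIFACONN-miner are described in Özbakır et al. (2010), and the TACO rule generation step is not shown.

```python
import random
from typing import Callable, List, Sequence

Vector = List[float]


def de_generation(population: List[Vector],
                  fitness: Callable[[Vector], float],
                  f: float = 0.5, cr: float = 0.9,
                  rng=random) -> List[Vector]:
    """One generation of DE/rand/1/bin for minimisation.

    Needs at least four individuals so that three distinct donors
    different from the target can be drawn.
    """
    dim = len(population[0])
    next_pop = []
    for i, target in enumerate(population):
        others = [p for j, p in enumerate(population) if j != i]
        a, b, c = rng.sample(others, 3)
        j_rand = rng.randrange(dim)  # forces at least one mutated component
        trial = [
            a[k] + f * (b[k] - c[k])
            if (rng.random() < cr or k == j_rand) else target[k]
            for k in range(dim)
        ]
        # Greedy selection: keep the trial only if it is no worse.
        next_pop.append(trial if fitness(trial) <= fitness(target) else target)
    return next_pop


def combined_fitness(ann_error: float, n_rules: int, train_accuracy: float,
                     weights: Sequence[float] = (1.0, 0.01, 1.0)) -> float:
    """Hypothetical weighted sum of the three measures (lower is better)."""
    w_err, w_rules, w_acc = weights
    return w_err * ann_error + w_rules * n_rules + w_acc * (1.0 - train_accuracy)


# Stand-in objective: a sphere function replaces the real ANN error,
# which would require training data and a network definition.
def sphere(w: Vector) -> float:
    return sum(x * x for x in w)


rng = random.Random(0)
population = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(8)]
for _ in range(20):
    population = de_generation(population, sphere, rng=rng)
print(min(sphere(w) for w in population))
```

Because selection is greedy, the best objective value in the population can never get worse from one generation to the next. In an integrated scheme such as the one described above, `fitness` would instead score each candidate weight vector by the error of the resulting ANN and by the quality of the rules generated from it.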

Section snippets

An overview of DIFACONN-miner

DIFACONN-miner was recently developed by the authors. In this section a summary of DIFACONN-miner is given; for more details, refer to Özbakır et al. (2010). DIFACONN-miner is composed of four interdependent parts: “data coding”, “training with differential evolution (DE) algorithm”, “touring ant colony optimization (TACO) algorithm for rule set generation” and “fitness evaluation”. The main structure of DIFACONN-miner is depicted in Fig. 1. DIFACONN-miner is able to work with any data

Case study

In this paper, specific quality defects at a textile manufacturer are considered in order to determine their main causes. This firm has a professional information retrieval system, and detailed processing data is transferred into a database in real time. Several types of data, such as technical information on machines, product types, defect types and rates, machine stops and time intervals, and machine failures and their causes, are stored in databases. In order to analyze the performance of

Conclusion

Data mining should be part of the toolset of researchers and practitioners who want to enable sustainable production. An enormous amount of data is produced in almost every production company. Extracting useful information from this massive data through data mining technologies, for example to reduce scrap or improve energy utilization, can be of much help in enhancing sustainability. In this paper, a case study is performed for defect diagnosis in fabric production in a major textile company

Acknowledgement

Professor Baykasoglu is grateful to the Turkish Academy of Sciences (TÜBA) for supporting his scientific studies.

References (21)

A. Baykasoğlu et al. MEPAR-miner: Multi-expression programming for classification rule mining. European Journal of Operational Research (2007)
E.R. Hruschka et al. Extracting rules from multilayer perceptrons in classification problems: A clustering-based approach. Neurocomputing (2006)
R. Li et al. Mining classification rules using rough sets and neural networks. European Journal of Operational Research (2004)
L. Özbakır et al. A soft-computing based approach for integrated training and rule extraction from artificial neural networks: DIFACONN-miner. Applied Soft Computing (2010)
L. Özbakır et al. TACO-miner: An ant colony based algorithm for rule extraction from trained neural networks. Expert Systems with Applications (2009)
R. Setiono et al. A note on knowledge discovery using neural networks and its application to credit card screening. European Journal of Operational Research (2009)
Bash, C., Patel, C., Shah, A., & Sharma, R. (2008). The sustainable information technology ecosystem. In ITHERM’08,...
E. Elalfi et al. Extracting rules from trained neural network using GA for managing e-business. Applied Soft Computing (2004)
E. Frank et al.
W. Frawley et al. Knowledge discovery in databases: An overview. AI Magazine (1992)

There are more references available in the full text version of this article.

Cited by (13)

Novel associative classifier based on dynamic adaptive PSO: Application to determining candidates for thoracic surgery
2014, Expert Systems with Applications
Citation Excerpt :
Various rule based classifiers using swarm intelligence inspired techniques have been proposed in literature and applied to real world problems in different domains (Cattral, Oppacher, & Graham, 2009; Fernández, María José del, & Herrera, 2009; Gonzales, Taboada, Mabu, Shimada, & Hirasawa, 2010; Kwasnicka & Switalski, 2005; Liu & Qin, 2004; Mangat, 2011; Melgani & Bazi, 2008; Verhein & Chawla, 2007; Vreeken, van Leeuwen, & Siebes, 2011; Yan, Zhang, & Zhang, 2009). Some of the applications are: advanced swarm intelligence mining algorithm for selecting candidates for surgery for temporal lobe epilepsy (Ghannad-Rezaie, Soltanain-Zadeh, Siadat, & Elisevich, 2006), clustering method based on barebones particle swarms for segmentation of MRI images (Omran & Al-Sharhan, 2007), association rule mining for discovering hyperlipidemia from biochemistry blood parameters (Dogan & Turkoglu, 2008), combined PSO and ACO approach to mine data for use in a pharmacovigilance context (Sordo, Shawn, & Murphy, 2009), handwritten Arabic numerals recognition (Nebti & Boukerram, 2010), DIFACONN-Miner for generating classification rules to find causes of defects (Baykasoglu, Ozbakir, & Kulluk, 2011), classification of images (Wahid, 2011), flow shop scheduling using discrete Artificial Bee Colony and hybrid differential evolution algorithm (Tasgetiren, Pan, et al., 2011), etc. Using Particle Swarm Optimization for performing data mining tasks was suggested in Silva and Neves (2004).
Association rule mining is a data mining technique for discovering useful and novel patterns or relationships from databases. These rules are simple to infer and intuitive and can be easily used for classification in any domain that requires explanation for and investigation into how the classification works. Examples of such areas are medicine, agriculture, education, etc. For such a system to find wide adoptability, it should give output that is correct and comprehensible. The amount of data has been growing very fast and so has the search space of these problems. So we need to change traditional methods. This paper discusses a rule mining classifier called DA-AC (dynamic adaptive-associative classifier) which is based on a Dynamic Particle Swarm Optimizer. Due to its seeding method, exemplar selection, adaptive parameters, dynamic reconstruction of regions and velocity update, it avoids premature convergence and provides a better value in every dimension. Quality evaluation is done both for individual rules as well as entire rulesets. Experiments were conducted over fifteen benchmark datasets to evaluate performance of proposed algorithm in comparison with six other state-of-the-art non associative classifiers and eight associative classifiers. Results demonstrate competitive performance of proposed DA-AC while considering predictive accuracy and number of mined patterns as parameters. The method was then applied to predict life expectancy of post operative thoracic surgery patients.
A hybrid OLAP-association rule mining based quality management system for extracting defect patterns in the garment industry
2013, Expert Systems with Applications
Citation Excerpt :
To perform quality diagnosis, various types of knowledge such as knowledge of the defect problems are required (Deslandres & Pierreval, 1997). Due to the ease of collecting relevant data, performing the analysis and interpreting the results, data mining applications are commonly employed to provide useful feedback to corrective actions for quality improvement (Baykasoglu, Özbakir, & Kulluk, 2011; Köksal, Batmaz, & Testik, 2011). Milne, Drummond, and Renoux (1998) extracted process data and past faults by data mining so as to discover patterns which may cause paper defects.
In today’s garment industry, garment defects have to be minimized so as to fulfill the expectations of demanding customers who seek products of high quality but low cost. However, without any data mining tools to manage massive data related to quality, it is difficult to investigate the hidden patterns among defects which are important information for improving the quality of garments. This paper presents a hybrid OLAP-association rule mining based quality management system (HQMS) to extract defect patterns in the garment industry. The mined results indicate the relationship between defects which serves as a reference for defect prediction, root cause identification and the formulation of proactive measures for quality improvement. Because real-time access to desirable information is crucial for survival under the severe competition, the system is equipped with Online Analytical Processing (OLAP) features so that manufacturers are able to explore the required data in a timely manner. The integration of OLAP and association rule mining allows data mining to be applied on a multidimensional basis. A pilot run of the HQMS is undertaken in a garment manufacturing company to demonstrate how OLAP and association rule mining are effective in discovering patterns among product defects. The results indicate that the HQMS contributes significantly to the formulation of quality improvement in the industry.
Differential Evolution for automatic rule extraction from medical databases
2013, Applied Soft Computing Journal
Citation Excerpt :
Highly appealing here is DE feature of being often faster than the other evolutionary algorithms in converging to high-quality solutions in terms of lower number of evaluations. Up to now DE has been used in classification tasks in combination with other tools as neural networks [15–54], bayes-based methods [24,25], fuzzy logic tools [26], nearest neighbor [27,28], and so on, but just seldom has it been applied on its own [29–33]. More importantly, DE has never been used for carrying out by itself the task of extracting classification rules from databases, as it can be noted in [34], where a wide list of applications of DE is reported.
In this paper, a new approach based on Differential Evolution (DE) for the automatic classification of items in medical databases is proposed. Based on it, a tool called DEREx is presented, which automatically extracts explicit knowledge from the database under the form of IF-THEN rules containing AND-connected clauses on the database variables. Each DE individual codes for a set of rules. For each class more than one rule can be contained in the individual, and these rules can be seen as logically connected in OR. Furthermore, all the classifying rules for all the classes are found all at once in one step. DEREx is thought as a useful support to decision making whenever explanations on why an item is assigned to a given class should be provided, as it is the case for diagnosis in the medical domain. The major contribution of this paper is that DEREx is the first classification tool in literature that is based on DE and automatically extracts sets of IF-THEN rules without the intervention of any other mechanism. In fact, all other classification tools based on DE existing in literature either simply find centroids for the classes rather than extracting rules, or are hybrid systems in which DE simply optimizes some parameters whereas the classification capabilities are provided by other mechanisms. For the experiments eight databases from the medical domain have been considered. First, among ten classical DE variants, the most effective of them in terms of highest classification accuracy in a ten-fold cross-validation has been found. Secondly, the tool has been compared over the same eight databases against a set of fifteen classifiers widely used in literature. The results have proven the effectiveness of the proposed approach, since DEREx turns out to be the best performing tool in terms of highest classification accuracy. Also statistical analysis has confirmed that DEREx is the best classifier. 
When compared to the other rule-based classification tools here used, DEREx needs the lowest average number of rules to face a problem, and the average number of clauses per rule is not very high. In conclusion, the tool here presented is preferable to the other classifiers because it shows good classification accuracy, automatically extracts knowledge, and provides users with it under an easily comprehensible form.
Training neural networks with harmony search algorithms for classification problems
2012, Engineering Applications of Artificial Intelligence
Citation Excerpt :
Data related to the attributes are collected from different databases and combined into the single database. The attributes that have possible effects on this quality defect are shown in Table 11 (Baykasoglu et al., in press). Knot defect dataset consists of 230 instances and well-known ten-fold cross-validation procedure is also applied to this dataset as in benchmark datasets.
Training neural networks (NNs) is a complex task of great importance in the supervised learning area. However, performance of the NNs is mostly dependent on the success of training process, and therefore the training algorithm. This paper addresses the application of harmony search algorithms for the supervised training of feed-forward (FF) type NNs, which are frequently used for classification problems. In this paper, five different variants of harmony search algorithm are studied by giving special attention to Self-adaptive Global Best Harmony Search (SGHS) algorithm. A structure suitable to data representation of NNs is adapted to SGHS algorithm. The technique is empirically tested and verified by training NNs on six benchmark classification problems and a real-world problem. Among these benchmark problems two of them have binary classes and remaining four are n-ary classification problems. Real-world problem is related to the classification of most frequently encountered quality defect in a major textile company in Turkey. Overall training time, sum of squared errors, training and testing accuracies of SGHS algorithm, is compared with the other harmony search algorithms and the most widely used standard back-propagation (BP) algorithm. The experiments presented that the SGHS algorithm lends itself very well to the training of NNs and also highly competitive with the compared methods in terms of classification accuracy.
A Case Study with the BEE-Miner Algorithm: Defects on the Production Line
2023, Springer Series in Advanced Manufacturing
Forecasting the level of expert knowledge using the GMDH method
2019, Communications in Computer and Information Science

