Elsevier

Expert Systems with Applications

Volume 38, Issue 9, September 2011, Pages 11321-11328
Classifying defect factors in fabric production via DIFACONN-miner: A case study

https://doi.org/10.1016/j.eswa.2011.02.182

Abstract

In this paper, a data mining based case study is carried out in a major textile company in Turkey in order to classify and analyze the defect factors in its fabric production process. The aim is to understand the causes of the defects in order to minimize their occurrence. The main motivation behind this study is to minimize scrap losses in the company and to enable more sustainable production via data mining. In the analyses, a data mining tool (DIFACONN-miner) that was recently developed by the authors is employed. DIFACONN-miner is a novel data mining tool which combines several metaheuristics and artificial neural networks intelligently, and it is capable of producing comprehensive classification rules from any data type.

Research highlights

► A novel approach is provided to classify quality defects in fabric production.
► A real-life case study and its results are presented.
► DIFACONN-miner, a novel data mining tool recently proposed by the authors, is used to model and solve the problem.
► DIFACONN-miner is shown to be able to model and classify quality defects in fabric production.

Introduction

Employing data mining to analyze manufacturing data can be a great support to sustainable and green manufacturing. This is mainly because knowledge gathered via a data mining (DM) approach can be used to improve manufacturing productivity, reduce scrap rates and minimize rework. All these improvements mean less material and energy consumption, which is a prerequisite for sustainable and green manufacturing practices. Based on this motivation, a data mining based study was carried out for defect diagnosis in fabric production in a major textile company in Turkey. Several other authors have also recognized that data mining can be a very useful and enabling tool for sustainable manufacturing. For example, Modapothala and Issac (2009) employed data mining for analyzing corporate environmental reports. They conclude that their findings confirm a reliable direction in sustainability with the use of data mining techniques. Marwah, Sharma, Ramakrishnan, Bash, and Patel (2009a) pointed out that, given the large amount of data produced by physical ecosystems, manual inspection is virtually impossible, and thus automated knowledge discovery and data mining techniques are required to synthesize models for enabling sustainable end-to-end operation and management of physical ecosystems. They defined a physical ecosystem as a collection of entities and processes that consume physical resources and fulfill functionality. They proposed a data mining framework which uses sustainability metrics such as energy, power consumption and carbon footprint to quantify the desirability of operating the system. Their aim was to monitor the data patterns and determine whether it is possible to transform the current operational state into another, more efficient one. They noted that the domain knowledge required to move from one state to another, where feasible, can be encoded into actionable rules which can be generated through data mining. Several similar conclusions can be found in Bash et al. (2008) and Marwah et al. (2009b).

In recent years, defect diagnosis has become an important activity for ensuring high-quality production. It involves identifying defect information, which can then be used to adjust the manufacturing process in order to improve the manufacturing yield. Effective defect diagnosis can result in low-cost, high-quality products and improve many other manufacturing performance metrics. The DM approach can be very useful here, as it is one of the best approaches for analyzing vast amounts of operational data. DM, defined as the nontrivial extraction of implicit, previously unknown and potentially useful information from data (Frawley, Piatetsky-Shapiro, & Matheus, 1992), is gaining great interest in economic, manufacturing and scientific domains. DM encompasses a number of different technical approaches such as clustering, data summarization, learning classification rules, finding dependency networks and detecting anomalies (Baykasoğlu & Özbakır, 2007). These DM techniques are used to reveal critical information hidden in data sets. In this study, a “classification rule extraction” type DM approach is used. Classification rule extraction is a common data mining task which is characterized by a concern for finding highly predictive rules, often by using heuristic techniques. Classification is the process of finding a set of models or functions which describe and distinguish data classes or concepts, so that the model can be used to predict the class of objects whose class label is unknown (Han & Kamber, 2001). Classification formulates a classification model based on the analysis of a set of training data. In classification, discovered knowledge is generally represented in the form of IF–THEN rules. Here the main goal of a rule extraction system is to obtain insights into data sets.
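As a minimal illustration of the IF–THEN rule form described above, the following Python sketch applies an ordered rule list to records and measures its accuracy. The attribute names (`machine_speed`, `yarn_tension`), the thresholds, the class labels and the sample records are hypothetical and are not taken from the case study; the first-match strategy with a default class is also an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]


@dataclass
class Rule:
    """IF all conditions hold THEN predict `label`."""
    conditions: List[Callable[[Record], bool]]
    label: str

    def matches(self, record: Record) -> bool:
        return all(condition(record) for condition in self.conditions)


def classify(record: Record, rules: List[Rule], default: str) -> str:
    """Return the label of the first matching rule, else the default class."""
    for rule in rules:
        if rule.matches(record):
            return rule.label
    return default


def accuracy(records: List[Record], labels: List[str],
             rules: List[Rule], default: str) -> float:
    """Fraction of records whose predicted class equals the known class."""
    predictions = [classify(record, rules, default) for record in records]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical rule: IF machine_speed > 800 AND yarn_tension < 20 THEN defect
rules = [
    Rule(
        conditions=[
            lambda r: r["machine_speed"] > 800,
            lambda r: r["yarn_tension"] < 20,
        ],
        label="defect",
    ),
]

# Hypothetical labelled records, invented for this sketch only.
records = [
    {"machine_speed": 850, "yarn_tension": 15},
    {"machine_speed": 700, "yarn_tension": 15},
    {"machine_speed": 900, "yarn_tension": 30},
]
labels = ["defect", "no_defect", "defect"]
```

Calling `accuracy(records, labels, rules, default="no_defect")` scores the rule list against the known labels, which is the kind of predictive quality a rule extraction method tries to maximize.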

An effective data mining study requires an effective data mining algorithm that is able to produce high-quality, accurate classification rules. For that purpose, in recent years there have been numerous attempts to apply various algorithms to accomplish the classification task accurately. The artificial neural network (ANN) is one of the most widely used models for that purpose. An ANN is a mathematical or computational model based on biological neural networks. It consists of an interconnected group of artificial neurons. ANNs are excellent at predicting, learning from experience, and generalizing from previous examples (Fu & Wang, 2001). For this reason, many researchers have focused on applying ANNs to classification rule extraction. However, a main disadvantage of ANNs is the difficulty of explaining how and what exactly an ANN has learned. In order to overcome this difficulty, researchers have developed new approaches for rule extraction from trained ANNs.

Elalfi, Haque, and Elalami (2004) presented a new algorithm for extracting accurate and comprehensible classification rules from databases via trained ANNs by using genetic algorithms (GA). Their algorithm does not depend on the ANN training algorithm and does not modify the training results. The GA was used to find the optimal values of input attributes, which maximize the output function of output nodes. The optimal chromosome was then decoded to obtain a rule belonging to a target class. Markowska-Kaczmar and Wnuk-Lipinski (2004) presented a method for rule extraction from an ANN based on a GA approach with Pareto optimization. They described the idea of Pareto optimization, showed the details of the proposed method, and tested it on well-known benchmark data sets. Li and Wang (2004) proposed a hybrid system to extract classification rules from decision tables. Unlike previous studies, their system uses ANNs only as a tool to reduce the decision table and filter its noise, while the final rule set is generated from the reduced decision table by using rough sets. They verified the effectiveness of their approach through experiments and comparisons with traditional rough set and ANN approaches. Tokinaga, Lu, and Ikeda (2005) studied the use of ANN rule extraction techniques based on genetic programming (GP) to build intelligent and explanatory evolution systems. They used GP to automate the rule extraction process in the trained ANN, in which the statements are converted into a binary classification. As applications, they generated rules for the prediction of bankruptcy and creditworthiness as binary classifications, and applied their method to the multi-level classification of corporate bonds by using financial indicators. Hruschka and Ebecken (2006) proposed a clustering-based approach for extracting rules from multi-layer perceptron type ANNs in classification problems. Their rule extraction algorithm basically consists of two steps. First, a clustering GA is applied to find clusters of hidden unit activation values. Then classification rules describing these clusters, in relation to the inputs, are generated. They experimentally evaluated the performance of their approach on four data sets that are benchmarks for DM applications. Setiono, Baesens, and Mues (2009) addressed the generation of comprehensible rule sets from trained ANNs. Their algorithm is particularly appropriate in applications where comprehensibility as well as accuracy is required. Their experimental results show that their algorithm produces accurate rule sets that are concise and comprehensible. Özbakır, Baykasoğlu, Kulluk, and Yapıcı (2009) presented a study on rule extraction from trained ANNs for classification problems. The proposed method uses an ant colony optimization algorithm for extracting accurate and comprehensible rules from databases via trained ANNs. They experimentally evaluated their algorithm on five benchmark data sets, and their results show that the proposed algorithm has the potential to generate accurate and concise rules.

Recently, the authors (Özbakır, Baykasoğlu, & Kulluk, 2010) developed a novel algorithm named DIFACONN-miner for classification rule extraction from ANNs. Unlike previous approaches, DIFACONN-miner integrates the ANN-training and rule extraction phases. Therefore, classification rules can be obtained directly from ANNs without an extra rule extraction step. This means that at every ANN-training step the corresponding classification rules are generated simultaneously, and training is steered toward generating more accurate classification rules. In fact, DIFACONN-miner can be called a “rule generation algorithm” rather than a “rule extraction algorithm”. DIFACONN-miner uses the differential evolution (DE) algorithm for training ANNs and the touring ant colony optimization (TACO) algorithm for generating classification rules. The fitness of an ANN structure is evaluated according to a multiple objective function which consists of three performance measures, namely “error of ANN”, “number of rules” and “training accuracy”. The performance of DIFACONN-miner was evaluated by the authors on many different test problems, and it was shown that DIFACONN-miner is able to produce accurate and effective classification rules. Therefore, we selected DIFACONN-miner as the data miner in the present case study. In the following sections of this paper, first a brief overview of DIFACONN-miner is given. Afterwards, the case study and the obtained results are presented along with some conclusions.
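To make the three-measure fitness idea concrete, the sketch below combines “error of ANN”, “number of rules” and “training accuracy” into a single score. The text above does not state how DIFACONN-miner combines these measures, so the weighted-sum form, the equal default weights, the `max_rules` normalization and the function name `fitness` are all assumptions made for illustration, not the authors' formulation.

```python
from typing import Tuple


def fitness(ann_error: float,
            n_rules: int,
            train_accuracy: float,
            max_rules: int,
            weights: Tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3)) -> float:
    """Scalar fitness to be minimized (assumed weighted-sum scalarization).

    ann_error      -- ANN error, assumed to be scaled to [0, 1]
    n_rules        -- number of rules in the generated rule set
    train_accuracy -- fraction of training records classified correctly
    max_rules      -- hypothetical cap used to scale the rule count to [0, 1]
    """
    if max_rules <= 0:
        raise ValueError("max_rules must be positive")
    w_error, w_rules, w_accuracy = weights
    rule_penalty = min(n_rules, max_rules) / max_rules
    # Accuracy is a "higher is better" measure, so it enters as 1 - accuracy.
    return (w_error * ann_error
            + w_rules * rule_penalty
            + w_accuracy * (1.0 - train_accuracy))
```

Under these assumptions, a candidate with lower error, fewer rules and higher training accuracy always receives a lower (better) score. Changing the weights shifts the trade-off between compact rule sets and accurate ones.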

An overview of DIFACONN-miner

DIFACONN-miner was recently developed by the authors. This section gives a summary of DIFACONN-miner; for more details, refer to Özbakır et al. (2010). DIFACONN-miner is composed of four interdependent parts: “data coding”, “training with differential evolution (DE) algorithm”, “touring ant colony optimization (TACO) algorithm for rule set generation” and “fitness evaluation”. The main structure of DIFACONN-miner is depicted in Fig. 1. DIFACONN-miner is able to work with any data
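The DE training part listed above can be sketched generically as one generation of the classic DE/rand/1/bin scheme acting on real-valued vectors (for an ANN, such a vector would hold the connection weights). The variant, the scale factor `f`, the crossover rate `cr` and the function name `de_generation` are assumptions chosen for illustration; this is not the authors' implementation, and the population must contain at least four vectors.

```python
import random
from typing import Callable, List, Optional

Vector = List[float]


def de_generation(population: List[Vector],
                  objective: Callable[[Vector], float],
                  f: float = 0.5,
                  cr: float = 0.9,
                  rng: Optional[random.Random] = None) -> List[Vector]:
    """Run one DE/rand/1/bin generation, minimizing `objective`."""
    rng = rng or random.Random()
    dim = len(population[0])
    next_population = []
    for i, target in enumerate(population):
        # Pick three distinct individuals, all different from the target.
        others = [j for j in range(len(population)) if j != i]
        a, b, c = (population[j] for j in rng.sample(others, 3))
        # Binomial crossover; one forced index guarantees a mutated gene.
        forced = rng.randrange(dim)
        trial = [
            a[k] + f * (b[k] - c[k])
            if (rng.random() < cr or k == forced)
            else target[k]
            for k in range(dim)
        ]
        # Greedy selection: keep the trial only if it is not worse.
        if objective(trial) <= objective(target):
            next_population.append(trial)
        else:
            next_population.append(target)
    return next_population
```

Because selection is greedy per individual, the best objective value in the population never gets worse from one generation to the next. In a training setting the objective would be a fitness computed from the network built from each vector.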

Case study

In this paper, specific quality defects at a textile manufacturer are considered in order to determine their main causes. The firm has a professional information retrieval system, and detailed processing data is transferred into a database in real time. Several types of data, such as technical information on machines, product types, defect types and rates, machine stops and time intervals, and machine failures and their causes, are stored in databases. In order to analyze the performance of

Conclusion

Data mining should be part of the toolset of researchers and practitioners who want to enable sustainable production. An enormous amount of data is produced in almost every production company. Extracting useful information from this massive data through data mining technologies, for example to reduce scrap or improve energy utilization, can be of much help in enhancing sustainability. In this paper, a case study is performed for defect diagnosis in fabric production in a major textile company

Acknowledgement

Professor Baykasoglu is grateful to the Turkish Academy of Sciences (TÜBA) for supporting his scientific studies.
