Abstract
We propose a novel association and text mining system for knowledge discovery (ASTEK) from warranty and service data in the automotive domain. The complex architecture of modern vehicles makes fault diagnosis and isolation a non-trivial task. The association mining component isolates anomaly cases from millions of service and claims records, and ASTEK has shown 86% accuracy in correctly identifying these cases. The text mining component subscribes to the diagnosis and prognosis (D&P) ontology, which provides the necessary domain-specific knowledge. The root causes of the anomaly cases are identified by discovering the symptoms frequently associated with the part failures, along with the repair actions used to fix those failures. The resulting best-practice knowledge is disseminated to the dealers involved in the anomaly cases. ASTEK has been implemented as a prototype in the service and quality department of GM, and its performance has been validated in a real-life setting. On average, the analysis time is reduced from a few weeks to a few minutes, which is a significant improvement in an industrial setting.
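As a rough illustration of the idea of finding symptoms and repair actions that frequently co-occur with a part failure, the following minimal sketch counts pair support and a simple rule confidence over a handful of records. It is not the ASTEK implementation: the records, the item labels, and the helper names `pair_support` and `confidence` are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records (not real warranty data): each set holds the
# items noted on one service claim.
records = [
    {"part:battery", "symptom:no_start", "repair:replace"},
    {"part:battery", "symptom:no_start", "repair:recharge"},
    {"part:battery", "symptom:no_start", "repair:replace"},
    {"part:alternator", "symptom:dim_lights", "repair:replace"},
]


def pair_support(records):
    """Count how many records contain each unordered pair of items."""
    counts = Counter()
    for rec in records:
        # Sorting gives every pair a canonical (alphabetical) order.
        for pair in combinations(sorted(rec), 2):
            counts[pair] += 1
    return counts


def confidence(records, antecedent, consequent):
    """Fraction of records containing `antecedent` that also contain `consequent`."""
    with_antecedent = [r for r in records if antecedent in r]
    if not with_antecedent:
        return 0.0
    return sum(consequent in r for r in with_antecedent) / len(with_antecedent)


support = pair_support(records)
battery_no_start = support[("part:battery", "symptom:no_start")]
battery_replace_conf = confidence(records, "part:battery", "repair:replace")
```

Pairs with high support and confidence would be candidates for closer review; a real system would also need thresholds, an interestingness measure, and the ontology-guided text processing that the abstract describes.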
Cite this article
Rajpathak, D., Chougule, R. & Bandyopadhyay, P. A domain-specific decision support system for knowledge discovery using association and text mining. Knowl Inf Syst 31, 405–432 (2012). https://doi.org/10.1007/s10115-011-0409-1
DOI: https://doi.org/10.1007/s10115-011-0409-1