Abstract
A real-life reliability system is proposed that fuses field warranty failure data with failure modes extracted from unstructured repair verbatim data, using an ontology-based natural language processing technique, to support accurate estimation of component reliability. Traditionally, reliability estimation relies on warranty data alone, which offers limited support for the “failure confounding” problem, whereby the different failure modes associated with a component failure are confounded into a single failure mode. The resulting reliability estimates lack the required level of precision. Because our model takes into account the textual failure modes associated with component failures, it improves the overall reliability estimation. The system is evaluated against a baseline system in terms of absolute prediction error, using real-life data from the automotive domain (e.g., headlamp failures) collected at different mileage exposures. In the best case, the absolute errors of our model showed an improvement of 97 % relative to the baseline model (which does not consider the failure modes), while in the worst case the improvement was 71 %.









Notes
The experimental results in Jung and Bai [16] show that when the age and usage variables are strongly correlated, the univariate and bivariate approaches perform comparably. When the correlation is weak, however, the bivariate approach performs substantially better than the univariate approach.
Because of a data non-disclosure agreement with respect to third parties, we report dummy values in place of the data collected in the field.
In our domain, the data can be right- or left-skewed, and the Weibull model was used to accommodate this varying nature of the data. The Weibull model gave us the flexibility to model decreasing, increasing, or constant hazard functions and to describe the different phases of a component’s lifetime.
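As a minimal sketch of why the Weibull model offers this flexibility, the snippet below evaluates the standard two-parameter Weibull hazard for three shape values. The function names and the scale and mileage values are illustrative assumptions, not values from the study.

```python
import math


def weibull_hazard(t, shape, scale):
    """Hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape, and scale must be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


def weibull_reliability(t, shape, scale):
    """Survival probability R(t) = exp(-(t / scale) ** shape)."""
    if t < 0 or shape <= 0 or scale <= 0:
        raise ValueError("t must be non-negative; shape and scale positive")
    return math.exp(-((t / scale) ** shape))


if __name__ == "__main__":
    scale = 10_000.0  # hypothetical characteristic life, in miles
    for shape in (0.5, 1.0, 2.0):
        # shape < 1: decreasing hazard; shape == 1: constant; shape > 1: increasing
        rates = [weibull_hazard(t, shape, scale) for t in (1_000.0, 5_000.0, 20_000.0)]
        print(f"shape={shape}: hazard at 1k/5k/20k miles = {rates}")
```

A shape parameter below one corresponds to a decreasing hazard, a value of one to a constant hazard, and a value above one to an increasing hazard.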
RDFS is a World Wide Web Consortium (W3C) standard for specifying metadata models.
PMI-TR, proposed by Turney [41], can be defined as follows: \(Sim(t,t_j)=\log_2\left(1+\frac{hits(t \text{ AND } t_j)^{2}}{hits(t)\cdot hits(t_j)}\right)\), where \(hits(t)\) and \(hits(t_j)\) are the number of times \(t\) and \(t_j\) are observed in the corpus, and \(hits(t \text{ AND } t_j)\) is the number of documents in which both tuples are present. The proximity between the positions of two words is exploited to construct the tuples.
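The similarity score in this note can be sketched in a few lines. The numerator is read here as the squared joint hit count, which is an interpretation of the formula; the function name `tuple_similarity` and the counts in the example are hypothetical.

```python
import math


def tuple_similarity(hits_t, hits_tj, hits_joint):
    """Sim(t, t_j) = log2(1 + hits_joint**2 / (hits_t * hits_tj)).

    hits_t, hits_tj: occurrence counts of each tuple in the corpus.
    hits_joint: number of documents in which both tuples are present.
    """
    if hits_t <= 0 or hits_tj <= 0:
        raise ValueError("individual hit counts must be positive")
    if hits_joint < 0:
        raise ValueError("joint hit count must be non-negative")
    return math.log2(1 + hits_joint ** 2 / (hits_t * hits_tj))


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(tuple_similarity(hits_t=40, hits_tj=25, hits_joint=20))
    print(tuple_similarity(hits_t=40, hits_tj=25, hits_joint=0))
```

Under this reading the score is zero when the tuples never co-occur and grows as the joint count approaches the individual counts.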
The word window size is a free parameter, which can be defined in many ways [27]. In our work, we experimented with different word window sizes, namely five, seven, and ten words. The larger word windows, i.e., seven and ten words, yielded noisy results; hence, we settled on a word window of five words for constructing the tuples.
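One way to picture the effect of the window size is the sketch below, which counts word pairs that fall within a fixed window. It assumes that two words form a tuple when both fit inside a span of `window` consecutive tokens; the article does not spell out its exact construction, and the function name and the sample verbatim are hypothetical.

```python
from collections import Counter


def window_pairs(tokens, window=5):
    """Count unordered word pairs that co-occur within `window` consecutive tokens."""
    if window < 2:
        raise ValueError("window must cover at least two tokens")
    counts = Counter()
    for i, left in enumerate(tokens):
        # Words at positions i+1 .. i+window-1 share a window with position i.
        for right in tokens[i + 1 : i + window]:
            if left != right:
                counts[tuple(sorted((left, right)))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical repair verbatim, for illustration only.
    verbatim = "replaced left headlamp bulb due to internal short".split()
    for size in (5, 10):
        print(size, len(window_pairs(verbatim, window=size)))
```

A wider window admits pairs of distant words, which is consistent with the noisier results reported for the seven- and ten-word windows.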
References
Beckett D (ed) (2004) RDF/XML syntax specification (revised). W3C Recommendation. http://www.w3.org/TR/rdf-syntax-grammar/
Benedettini O, Baines TS, Lightfoot HW, Greenough RM (2009) State-of-the-art in integrated vehicle health management. J Aerosp Eng 223(2):157–170
Boufaden N (2003) An ontology-based semantic tagger for IE system. In: Proceedings of the 41st annual meeting on association for computational linguistics, Stroudsburg, PA, USA, pp 7–14
Brill E (1995) Unsupervised learning of disambiguation rules for part of speech tagging. In: Yarowsky D, Church K (eds) Natural language processing using very large corpora. Kluwer Academic Press, Cambridge, Massachusetts, pp 1–13
Caers JFJM, Zhao XJ, Mooren J, Stulens L, Eggink E (2010) Design for reliability—a reliability engineering framework. In: Proceedings of the 11th international conference on electronic packaging technology & high density packaging, pp 1108–1113
Church K (1988) A stochastic parts program and noun phrase parser for unrestricted text. In: Proceedings of the second conference on applied natural language processing, Stroudsburg, PA, USA, pp 136–143
Charniak E, Hendrickson C, Jacobson N, Perkowitz M (1993) Equations for part of speech tagging. In: Proceedings of the conference of the American Association for Artificial Intelligence, Menlo Park, pp 784–789
Cimiano P, Hotho A, Staab S (2005) Learning concept hierarchies from text corpora using formal concept analysis. J Artif Intell Res 24:305–339
Coit DW, Dey KA (1999) Analysis of grouped data from field-failure reporting systems. Reliab Eng Syst Saf 65(2):95–101
DeRose S (1988) Grammatical category disambiguation by statistical optimization. Comput Linguist 14(1):31–39
Derouault A, Merialdo B (1986) Natural language modeling for phoneme-to-text transcription. IEEE Trans Pattern Anal Mach Intell 8(6):742–749
Roche E, Schabes Y (1995) Deterministic POS tagging with finite-state transducers. Comput Linguist 21(2):227–253
Greene BB, Rubin GM (1971) Automatic grammatical tagging of English. Department of Linguistics, Brown University, Providence, Rhode Island, Technical report
Hindle D (1989) Acquiring disambiguation rules from text. In: Proceedings of the 27th annual meeting of the association for computational linguistics. Vancouver, British Columbia, pp 118–125
Hunter JJ (1974) Renewal theory in two dimensions: basic results. Adv Appl Probab 6:376–391
Jung M, Bai DS (2007) Analysis of field data under two-dimensional warranty. Reliab Eng Syst Saf 92(2):135–143
Kalbfleisch JD, Lawless JF (1988) Estimation of reliability in field-performance studies. Technometrics 30:365–388
Kalbfleisch JD, Lawless JF, Robinson JA (1991) Methods for the analysis and prediction of warranty claims. Technometrics 33:273–285
Karim MR, Suzuki K (2005) Analysis of warranty claim data: a literature review. Int J Qual Reliab Manag 22(7):667–686
Klein S, Simmons R (1963) A computational approach to grammatical coding of English words. J ACM 10(3):334–347
Samuel K (1998) Lazy transformation-based learning. In: Proceedings of the 11th international Florida artificial intelligence research symposium conference, Sanibel Island, Florida, USA, pp 235–239
Kilgarriff A, Rychly P, Smrz P, Tugwell D (2004) The sketch engine. In: Proceedings of Euralex, Lorient, France, pp 105–116
Kleyner A, Sandborn P (2008) Minimizing life cycle cost by managing product dependability via validation plan and warranty return cost. Int J Prod Econ 112(2):796–807
Lawless JF (1983) Statistical methods in reliability. Technometrics 25:305–335
Lawless JF, Hu J, Cao J (1995) Methods for the estimation of failure distributions and rates from automobile warranty data. Lifetime Data Anal 1:227–240
Lawless JF (1998) Statistical analysis of product warranty data. Int Stat Rev 66:41–60
Lund K, Burgess C (1996) Producing high-dimensional semantic spaces using lexical co-occurrence. Behav Res Methods 28(2):203–208
Majeske KD, Caris TL, Herrin G (1997) Evaluating product and process design changes with warranty data. Int J Prod Econ 50:79–89
Majeske KD (2003) A mixture model for automobile warranty data. Reliab Eng Syst Saf 81:71–77
Meteer M, Schwartz R, Weischedel R (1991) POST: using probabilities in language processing. In: Proceedings of the twelfth international conference on artificial intelligence, pp 960–965
Mukherjee S, Chakraborty A (2007) Automated fault tree generation: bridging reliability with text mining. In: Proceedings of reliability and maintainability symposium, Orlando FL, pp 83–88
Murthy DNP, Blischke WR (1992) Product warranty management—III: a review of mathematical models. Eur J Oper Res 62:1–34
Murthy DNP, Iskandar BP, Wilson RJ (1995) Two-dimensional failure-free warranty policies: two-dimensional point process models. Oper Res 43(2):356–366
Ngai G, Florian R (2001) Transformation-based learning in the fast lane. In: Proceedings of the second conference of the North American chapter of the association for computational linguistics, Pittsburgh, PA, pp 1–8
Oh YS, Bai DS (2001) Field data analyses with additional after warranty field-data. Reliab Eng Syst Saf 72(1):1–8
Florian R, Henderson JC, Ngai G (2000) Coaxing confidences from an old friend: probabilistic classifications from transformation rule lists. In: Proceedings of joint SIGDAT conference on empirical methods in natural language processing and very large corpora, pp 26–34
Rajpathak D, Chougule R (2011) A generic ontology development framework for data integration and decision support in a distributed environment. Int J Comput Integr Manuf 24(2):154–170
Rajpathak D, Chougule R, Bandyopadhyay P (2011) A domain specific decision support system for knowledge discovery using association and text mining. Int J Knowl Inf Syst 31(3):405–432
Rajpathak D (2013) An ontology based text mining system for knowledge discovery from the diagnosis data in the automotive domain. Int J Comput Ind 64(5):565–580
Singpurwalla ND, Wilson SP (1994) Software reliability modeling. Int Stat Rev 62:289–317
Turney P (2001) Mining the web for synonyms: PMI-IR versus LSA on TOEFL. In: Proceedings of the twelfth European conference on machine learning, Freiburg, Germany, pp 491–502
Wasserman GS (1992) An application of dynamic linear models for predicting warranty claims. Comput Ind Eng 22(1):37–47
Wessel F (2002) Word posterior probabilities for large vocabulary continuous speech recognition. Ph.D. thesis, RWTH Aachen University. Aachen, Germany
Acknowledgments
The authors would like to thank the reviewers and GM’s internal paper review committee for their valuable comments on earlier drafts of this manuscript.
Cite this article
Rajpathak, D., De, S. A data- and ontology-driven text mining-based construction of reliability model to analyze and predict component failures. Knowl Inf Syst 46, 87–113 (2016). https://doi.org/10.1007/s10115-014-0806-3
DOI: https://doi.org/10.1007/s10115-014-0806-3