A data- and ontology-driven text mining-based construction of reliability model to analyze and predict component failures

Rajpathak, Dnyanesh; De, Soumen

doi:10.1007/s10115-014-0806-3

Regular Paper
Published: 11 January 2015

Volume 46, pages 87–113, (2016)

Knowledge and Information Systems

Dnyanesh Rajpathak¹ &
Soumen De¹

Abstract

A real-life reliability system is proposed that fuses field warranty failure data with failure modes extracted from unstructured repair verbatim data, using an ontology-based natural language processing technique, to enable accurate estimation of component reliability. Traditionally, the reliability estimation process uses warranty data alone, which provides limited support for handling the “failure confounding” problem, whereby different failure modes associated with a component failure are confounded into a single failure mode. The resulting reliability estimate lacks the required level of precision. Because our model takes into account the textual failure modes associated with component failures, it improves the overall reliability estimation. The performance of our system is evaluated against a baseline system in terms of absolute prediction error, using real-life data from the automotive domain, e.g., headlamp failures, collected at different mileage exposures. In the best case, the absolute error of our model showed an improvement of 97% over the baseline model (which does not consider the failure modes), while in the worst case the improvement was 71%.

Notes

The experimental results in Jung and Bai [16] show that when the age and usage variables are strongly correlated, the performance of the univariate and bivariate approaches is comparable. However, when the correlation is weak, the bivariate approach performs markedly better than the univariate approach.
Because of a data non-disclosure agreement with a third party, we report dummy values in place of the data collected in the field.
In our domain, the data can be right or left skewed, and the Weibull model was used to accommodate this varying nature of the data. The Weibull model gave us the flexibility to model decreasing, increasing, or constant hazard functions and to describe the different phases of a component’s lifetime.
RDFS is a World Wide Web Consortium (W3C) standard for the specification of metadata models.
https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html.
PMI-IR, proposed by Turney [41], can be defined as follows: $\mathrm{Sim}(t, t_j) = \log_2\left(1 + \frac{\mathrm{hits}(t \text{ AND } t_j)^{2}}{\mathrm{hits}(t) \cdot \mathrm{hits}(t_j)}\right)$, where $\mathrm{hits}(t)$ and $\mathrm{hits}(t_j)$ are the number of times $t$ and $t_j$ are observed in the corpus, and $\mathrm{hits}(t \text{ AND } t_j)$ is the number of documents in which both tuples are present. The proximity between the positions of two words is exploited to construct the tuples.
The word window size is a free parameter, which can be defined in many ways [27]. In our work, we experimented with different word window sizes, such as five, seven, and ten. The larger word window sizes, i.e., seven and ten words, yielded noisy results; hence, we settled on a word window of five words for constructing the tuples.
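
The co-occurrence score and the five-word window described in the last two notes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`window_pairs`, `hit_counts`, `similarity`), the whitespace tokenization, and the sample repair verbatims are all hypothetical. It also makes two simplifying assumptions: each term is a single word, and every hit count is a document count, with a joint hit recorded only when the two words fall within the same window.

```python
import math
from collections import Counter


def window_pairs(tokens, window=5):
    """Return the unordered word pairs that co-occur within a span of `window` tokens."""
    pairs = set()
    for i, left in enumerate(tokens):
        # Pair each word with the following (window - 1) words.
        for right in tokens[i + 1 : i + window]:
            if left != right:
                pairs.add(frozenset((left, right)))
    return pairs


def hit_counts(documents, window=5):
    """Count, per word and per word pair, the number of documents containing it."""
    term_hits = Counter()
    pair_hits = Counter()
    for doc in documents:
        tokens = doc.lower().split()  # assumption: plain whitespace tokenization
        term_hits.update(set(tokens))
        pair_hits.update(window_pairs(tokens, window))
    return term_hits, pair_hits


def similarity(t, t_j, term_hits, pair_hits):
    """log2(1 + hits(t AND t_j)^2 / (hits(t) * hits(t_j))), or 0.0 for unseen words."""
    denominator = term_hits[t] * term_hits[t_j]
    if denominator == 0:
        return 0.0
    joint = pair_hits[frozenset((t, t_j))]
    return math.log2(1 + joint**2 / denominator)


# Hypothetical repair verbatims, for illustration only.
verbatims = [
    "headlamp bulb inop replaced bulb",
    "headlamp flickers intermittently replaced headlamp bulb",
    "battery dead replaced battery",
]
term_hits, pair_hits = hit_counts(verbatims, window=5)
print(similarity("headlamp", "bulb", term_hits, pair_hits))
print(similarity("headlamp", "battery", term_hits, pair_hits))
```

Under these assumptions, a wider window admits more distant word pairs into `pair_hits`, which is one plausible reading of why the seven- and ten-word windows were reported as noisy.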

References

1. Beckett D (ed) (2004) RDF/XML Syntax Specification (Revised), W3C Recommendation, 2004. http://www.w3.org/TR/rdf-syntax-grammar/
2. Benedettini O, Baines TS, Lightfoot HW, Greenough RM (2009) State-of-the-art in integrated vehicle health management. J Aerosp Eng 223(2):157–170
3. Boufaden N (2003) An ontology-based semantic tagger for IE system. In: Proceedings of the 41st annual meeting on association for computational linguistics, Stroudsburg, PA, USA, pp 7–14
4. Brill E (1995) Unsupervised learning of disambiguation rules for part of speech tagging. In: Yarowsky D, Church K (eds) Natural language processing using very large corpora. Kluwer Academic Press, Cambridge, Massachusetts, pp 1–13
5. Caers JFJM, Zhao XJ, Mooren J, Stulens L, Eggink E (2010) Design for reliability—a reliability engineering framework. In: Proceedings of the 11th international conference on electronic packaging technology & high density packaging, pp 1108–1113
6. Church K (1988) A stochastic parts program and noun phrase parser for unrestricted text. In: Proceedings of the second conference on applied natural language processing, Stroudsburg, PA, USA, pp 136–143
7. Charniak E, Hendrickson C, Jacobson N, Perkowitz M (1993) Equations for part of speech tagging. In: Proceedings of the conference of the American Association for Artificial Intelligence, Menlo Park, pp 784–789
8. Cimiano P, Hotho A, Staab S (2005) Learning concept hierarchies from text corpora using formal concept analysis. J Art Intel Res 24:305–339
9. Coit DW, Dey KA (1999) Analysis of grouped data from field-failure reporting systems. Reliab Eng Syst Saf 65(2):95–101
10. DeRose S (1988) Grammatical category disambiguation by statistical optimization. Comput Linguist 14(1):31–39
11. Derouault A, Merialdo B (1986) Natural language modeling for phoneme-to-text transcription. IEEE Trans Pattern Anal Mach Intel 8(6):742–749
12. Emmanuel R, Schabes Y (1995) Deterministic POS tagging with finite-state transducers. Comput Linguist 21(2):227–253
13. Greene BB, Rubin GM (1971) Automatic grammatical tagging of English. Department of Linguistics, Brown University, Providence, Rhode Island, Technical report
14. Hindle D (1989) Acquiring disambiguation rules from text. In: Proceedings of the 27th annual meeting of the association for computational linguistics, Vancouver, British Columbia, pp 118–125
15. Hunter JJ (1974) Renewal theory in two dimensions: basic results. Adv Appl Probab 6:376–391
16. Jung M, Bai DS (2007) Analysis of field data under two-dimensional warranty. Reliab Eng Syst Saf 92(2):135–143
17. Kalbfleisch JD, Lawless JF (1988) Estimation of reliability in field-performance studies. Technometrics 30:365–388
18. Kalbfleisch JD, Lawless JF, Robinson JA (1991) Methods for the analysis and prediction of warranty claims. Technometrics 33:273–285
19. Karim MR, Suzuki K (2005) Analysis of warranty claim data: a literature review. Int J Qual Reliab Manag 22(7):667–686
20. Klein S, Simmons R (1963) A computational approach to grammatical coding of English words. J ACM 10(3):334–347
21. Ken S (1998) Lazy transformation-based learning. In: Proceedings of the 11th international Florida artificial intelligence research symposium conference, Sanibel Island, Florida, USA, pp 235–239
22. Kilgarriff A, Rychly P, Smrz P, Tugwell D (2004) The sketch engine. In: Proceedings of Euralex, Lorient, France, pp 105–116
23. Kleyner A, Sandborn P (2008) Minimizing life cycle cost by managing product dependability via validation plan and warranty return cost. Int J Prod Econ 112(2):796–807
24. Lawless JF (1983) Statistical methods in reliability. Technometrics 25:305–335
25. Lawless JF, Hu J, Cao J (1995) Methods for the estimation of failure distributions and rates from automobile warranty data. Lifetime Data Anal 1:227–240
26. Lawless JF (1998) Statistical analysis of product warranty data. Int Stat Rev 66:41–60
27. Lund K, Burgess C (1996) Producing high-dimensional semantic spaces using lexical co-occurrence. Behav Res Methods 28(2):203–208
28. Majeske KD, Caris TL, Herrin G (1997) Evaluating product and process design changes with warranty data. Int J Prod Econ 50:79–89
29. Majeske KD (2003) A mixture model for automobile warranty data. Reliab Eng Syst Saf 81:71–77
30. Meteer M, Schwartz R, Weischedel R (1991) POST: using probabilities in language processing. In: Proceedings of the twelfth international conference on artificial intelligence, pp 960–965
31. Mukherjee S, Chakraborty A (2007) Automated fault tree generation: bridging reliability with text mining. In: Proceedings of reliability and maintainability symposium, Orlando FL, pp 83–88
32. Murthy DNP, Blischke WR (1992) Product warranty management—III: a review of mathematical models. Eur J Oper Res 62:1–34
33. Murthy DNP, Iskandar BP, Wilson RJ (1995) Two-dimensional failure-free warranty policies: two-dimensional point process models. Oper Res 43(2):356–366
34. Ngai G, Radu F (2001) Transformation-based learning in the fast lane. In: Proceedings of the second conference of the North American chapter of the association for computational linguistics, Pittsburgh, PA, pp 1–8
35. Oh YS, Bai DS (2001) Field data analyses with additional after warranty field-data. Reliab Eng Syst Saf 72(1):1–8
36. Radu F, Henderson JC, Ngai G (2000) Coaxing confidences from an old friend: probabilistic classifications from transformation rule lists. In: Proceedings of joint SIGDAT conference on empirical methods in natural language processing and very large corpora, pp 26–34
37. Rajpathak D, Chougule R (2011) A generic ontology development framework for data integration and decision support in a distributed environment. Int J Comput Integr Manuf 24(2):154–170
38. Rajpathak D, Chougule R, Bandyopadhyay P (2011) A domain specific decision support system for knowledge discovery using association and text mining. Int J Knowl Inf Syst 31(3):405–432
39. Rajpathak D (2013) An ontology based text mining system for knowledge discovery from the diagnosis data in the automotive domain. Int J Comput Ind 64(5):565–580
40. Singpurwalla ND, Wilson SP (1994) Software reliability modeling. Int Stat Rev 62:289–317
41. Turney P (2001) Mining the web for synonyms: PMI-IR versus LSA on TOEFL. In: Proceedings of the twelfth European conference on machine learning, Freiburg, Germany, pp 491–502
42. Wasserman GS (1992) An application of dynamic linear models for predicting warranty claims. Comput Ind Eng 22(1):37–47
43. Wessel F (2002) Word posterior probabilities for large vocabulary continuous speech recognition. Ph.D. thesis, RWTH Aachen University, Aachen, Germany

Acknowledgments

The authors would like to thank reviewers and GM’s internal paper review committee for providing valuable comments on the earlier drafts of this manuscript.

Author information

Authors and Affiliations

Advanced Quality Analytics, General Motors, Creator Building, International Technology Park, Whitefield, Bangalore, 560066, Karnataka, India
Dnyanesh Rajpathak & Soumen De

Corresponding author

Correspondence to Dnyanesh Rajpathak.

About this article

Cite this article

Rajpathak, D., De, S. A data- and ontology-driven text mining-based construction of reliability model to analyze and predict component failures. Knowl Inf Syst 46, 87–113 (2016). https://doi.org/10.1007/s10115-014-0806-3

Received: 25 September 2013
Revised: 25 July 2014
Accepted: 07 November 2014
Published: 11 January 2015
Issue Date: January 2016
DOI: https://doi.org/10.1007/s10115-014-0806-3
