Automatic generation of question answer pairs from noisy case logs | IEEE Conference Publication | IEEE Xplore

Automatic generation of question answer pairs from noisy case logs


Abstract:

In a customer support scenario, a lot of valuable information is recorded in the form of 'case logs'. Case logs are written primarily for future reference or manual inspection, so they are written hastily and are very noisy. In this paper, we propose techniques that exploit these case logs to mine real customer concerns or problems and then map them to well-written knowledge articles for that enterprise. This mapping results in the generation of question-answer (QA) pairs. These QA pairs can be used for a variety of applications, such as dynamically updating the frequently asked questions (FAQs) or updating the knowledge repository. In this paper, we show the utility of these discovered QA pairs as training data for a question-answering system. Our approach for mining the case logs is based on a composite model consisting of two generative models, viz., a hidden Markov model (HMM) and a latent Dirichlet allocation (LDA) model. The LDA model explains the long-range dependencies across words due to their semantic similarity, and the HMM models the sequential patterns present in these case logs. Such processing results in crisp 'problem statement' segments that are indicative of the real customer concerns. Our experiments show that this approach finds crisp problem statements in 56% of the cases and outperforms alternative segmentation methods such as HMM, LDA, and conditional random field (CRF). After finding these crisp problem statements, appropriate answers are looked up from an existing knowledge repository index, forming candidate QA pairs. We show that considering only the problem-statement segments for which answers can be found further improves the segmentation performance to 82%. Finally, we show that when these QA pairs are used as training data, the performance of a question-answering system can be improved significantly.
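
The answer lookup step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the knowledge repository index is built or queried, so TF-IDF weighting with cosine similarity is assumed here as a stand-in. The names `build_index`, `make_qa_pairs`, and the `min_score` threshold are hypothetical. Problem statements with no sufficiently similar article are dropped, loosely mirroring the idea of keeping only segments for which an answer can be found.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def weight(counts, idf):
    """Turn raw term counts into an L2-normalized TF-IDF vector (a dict)."""
    vec = {t: c * idf[t] for t, c in counts.items() if t in idf}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(articles):
    """Build a toy index from {article_id: text}.

    Assumption: smoothed IDF; the paper's real index is not specified.
    """
    docs = {aid: Counter(tokenize(text)) for aid, text in articles.items()}
    n = len(docs)
    df = Counter()
    for counts in docs.values():
        df.update(counts.keys())
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}
    vectors = {aid: weight(counts, idf) for aid, counts in docs.items()}
    return idf, vectors


def cosine(a, b):
    """Cosine similarity of two already-normalized sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def make_qa_pairs(problem_statements, articles, min_score=0.2):
    """Pair each problem statement with its best-matching article.

    Statements whose best score is below min_score (hypothetical value)
    are discarded, so only answerable statements yield QA pairs.
    """
    idf, vectors = build_index(articles)
    pairs = []
    for question in problem_statements:
        qvec = weight(Counter(tokenize(question)), idf)
        best_id, best_score = None, 0.0
        for aid, dvec in vectors.items():
            score = cosine(qvec, dvec)
            if score > best_score:
                best_id, best_score = aid, score
        if best_id is not None and best_score >= min_score:
            pairs.append((question, best_id, best_score))
    return pairs
```

Calling `make_qa_pairs` with a list of extracted problem statements and a dictionary of knowledge articles returns `(question, article_id, score)` tuples. In practice the threshold would need tuning on held-out data, and a production index would likely use a dedicated search engine rather than an in-memory dictionary.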
Date of Conference: 31 March 2014 - 04 April 2014
Date Added to IEEE Xplore: 19 May 2014
Electronic ISBN: 978-1-4799-2555-1

Conference Location: Chicago, IL, USA
