Abstract
We propose an FAQ (Frequently Asked Questions) search method that uses the classification results of input queries. FAQs aim to cover frequently asked topics, and users usually search them with queries expressed as bags of words or natural language sentences. However, because each topic usually consists of only a representative question and its answer, the questions in FAQs rarely cover the variety of queries that share a meaning but differ in surface expression, for example through synonyms, paraphrases, or causal relations. As a result, users who cannot find their answers in FAQs ask a call center operator. To account for similarity of meaning among different surface expressions, we use a document classifier that classifies each query into FAQ topics. The classifier is trained not only with FAQs but also with the corresponding histories of operators, so that it covers a wider variety of queries. Because the corresponding histories do not include links to FAQs, we use a method that generates training data from the corresponding histories together with the FAQs. To generate training data correctly, the method exploits the fact that many answers in corresponding histories related to FAQs are created by quoting the corresponding FAQs. Specifically, it uses the surface similarity between answers in the corresponding histories and the answer part of each FAQ topic to generate training data automatically. Experimental results show that our method outperforms an FAQ search method based on word matching in terms of Mean Reciprocal Rank and Precision@N.
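The training-data generation step and the Mean Reciprocal Rank metric mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the surface similarity measure, so word-set Jaccard overlap, the 0.5 threshold, whitespace tokenization, and all function and variable names (label_histories, mean_reciprocal_rank, and so on) are assumptions made only for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercased whitespace tokens; a stand-in for real tokenization."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Word-set overlap, used here as an assumed surface similarity."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def label_histories(
    faq_answers: Dict[str, str],
    histories: List[Tuple[str, str]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> List[Tuple[str, str]]:
    """Pair each history query with the FAQ topic whose answer is most
    similar to the operator's answer; drop pairs below the threshold."""
    labeled: List[Tuple[str, str]] = []
    for query, operator_answer in histories:
        best_id: Optional[str] = None
        best_score = 0.0
        for faq_id, faq_answer in faq_answers.items():
            score = jaccard(operator_answer, faq_answer)
            if score > best_score:
                best_id, best_score = faq_id, score
        if best_id is not None and best_score >= threshold:
            labeled.append((query, best_id))
    return labeled


def mean_reciprocal_rank(rankings: List[List[str]], gold: List[str]) -> float:
    """Average of 1/rank of the correct topic; 0 when it is not ranked."""
    if not gold:
        return 0.0
    total = 0.0
    for ranked, correct in zip(rankings, gold):
        if correct in ranked:
            total += 1.0 / (ranked.index(correct) + 1)
    return total / len(gold)
```

The (query, topic) pairs returned by label_histories would then serve as extra training examples for the query classifier, alongside the FAQ questions themselves. A stricter threshold trades coverage for label accuracy.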
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Makino, T., Noro, T., Iwakura, T. (2016). An FAQ Search Method Using a Document Classifier Trained with Automatically Generated Training Data. In: Booth, R., Zhang, ML. (eds) PRICAI 2016: Trends in Artificial Intelligence. PRICAI 2016. Lecture Notes in Computer Science, vol 9810. Springer, Cham. https://doi.org/10.1007/978-3-319-42911-3_25
DOI: https://doi.org/10.1007/978-3-319-42911-3_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42910-6
Online ISBN: 978-3-319-42911-3
eBook Packages: Computer Science, Computer Science (R0)