
Effective FAQ Retrieval and Question Matching Tasks with Unsupervised Knowledge Injection

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12848)

Abstract

Frequently asked question (FAQ) retrieval, which aims to provide information on frequently raised questions or concerns, has far-reaching applications in areas such as e-commerce services and online forums, where a collection of question-answer (Q-A) pairs compiled a priori can be used to retrieve an appropriate answer to a user query that is likely to recur. Predominant approaches to FAQ retrieval rank Q-A pairs by considering the similarity between the query and a question (q-Q), the relevance between the query and the associated answer of a question (q-A), or a combination of the two. In this paper, we extend this line of research by combining the clues gathered from the q-Q similarity measure and the q-A relevance measure, while also injecting extra word-interaction information, distilled from a generic (open-domain) knowledge base, into a contextual language model for inferring the q-A relevance. Furthermore, we explore capitalizing on domain-specific, topically relevant relations between words in an unsupervised manner, acting as a surrogate for supervised domain-specific knowledge-base information. This equips sentence representations with knowledge about domain-specific and topically relevant relations among words, thereby yielding a better q-A relevance measure. We evaluate variants of our approach on a publicly available Chinese FAQ dataset (viz. TaipeiQA) and further apply and adapt it to a large-scale question-matching task (viz. LCQMC), which aims to retrieve questions from a QA dataset that share the intent of an input query. Extensive experimental results on these two datasets confirm the promising performance of the proposed approach relative to several state-of-the-art methods.
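To make the ranking idea in the abstract concrete, below is a minimal, self-contained Python sketch of interpolating a lexical q-Q similarity with a q-A relevance score. BM25 stands in for the q-Q measure; qa_relevance is a stub marking where the paper's knowledge-injected BERT model would go. The helper names (bm25_score, rank_faq) and the interpolation weight alpha are illustrative assumptions, not details from the paper.

    # A minimal sketch (not the authors' code) of the core ranking idea:
    # score each FAQ entry by combining a lexical query-Question (q-Q)
    # similarity with a model-based query-Answer (q-A) relevance.
    import math
    from collections import Counter

    def bm25_score(query_toks, doc_toks, corpus, k1=1.5, b=0.75):
        """BM25 similarity between a tokenized query and one tokenized document."""
        N = len(corpus)
        avgdl = sum(len(d) for d in corpus) / N
        tf = Counter(doc_toks)
        score = 0.0
        for t in set(query_toks):
            df = sum(1 for d in corpus if t in d)
            if df == 0:
                continue
            idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
            denom = tf[t] + k1 * (1 - b + b * len(doc_toks) / avgdl)
            score += idf * tf[t] * (k1 + 1) / denom
        return score

    def qa_relevance(query, answer):
        """Placeholder for the knowledge-injected BERT q-A relevance model;
        here a crude token-overlap proxy so the sketch runs end to end."""
        q, a = set(query.split()), set(answer.split())
        return len(q & a) / max(len(q), 1)

    def rank_faq(query, faq, alpha=0.7):
        """Rank (Question, Answer) pairs by an interpolated q-Q / q-A score.
        alpha (an assumed hyperparameter) weights the two measures."""
        questions = [q.split() for q, _ in faq]
        scored = []
        for (q_text, a_text), q_toks in zip(faq, questions):
            s_qq = bm25_score(query.split(), q_toks, questions)
            s_qa = qa_relevance(query, a_text)
            scored.append((alpha * s_qq + (1 - alpha) * s_qa, q_text, a_text))
        return sorted(scored, reverse=True)

    faq = [("how do i reset my password", "use the account page to reset it"),
           ("what are the opening hours", "we are open nine to five weekdays")]
    print(rank_faq("forgot my password", faq)[0])

In the paper's actual setting, qa_relevance would be a fine-tuned contextual language model whose representations are augmented with word relations distilled from an open-domain knowledge base or, as a surrogate, from unsupervised topic modeling over the domain corpus.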



Acknowledgment

This research is supported in part by ASUS AICS and the Ministry of Science and Technology (MOST), Taiwan, under Grant Number MOST 109-2634-F-008-006- through Pervasive Artificial Intelligence Research (PAIR) Labs, Taiwan, and Grant Numbers MOST 108-2221-E-003-005-MY3 and MOST 109-2221-E-003-020-MY3. Any findings and implications in the paper do not necessarily reflect the views of the sponsors.

Author information

Corresponding authors

Correspondence to Wen-Ting Tseng, Yung-Chang Hsu or Berlin Chen.



Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Tseng, W.-T., Hsu, Y.-C., Chen, B. (2021). Effective FAQ Retrieval and Question Matching Tasks with Unsupervised Knowledge Injection. In: Ekštein, K., Pártl, F., Konopík, M. (eds) Text, Speech, and Dialogue. TSD 2021. Lecture Notes in Computer Science (LNAI), vol. 12848. Springer, Cham. https://doi.org/10.1007/978-3-030-83527-9_11


  • DOI: https://doi.org/10.1007/978-3-030-83527-9_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-83526-2

  • Online ISBN: 978-3-030-83527-9

  • eBook Packages: Computer Science, Computer Science (R0)
