Abstract
Lexical disagreement problems often occur in FAQ retrieval because FAQs, unlike general documents, consist of only one or two sentences. To resolve these problems, we propose a high-performance FAQ retrieval system that uses query log clustering. At indexing time, the proposed system uses latent semantic analysis to classify users’ query logs and group them into predefined FAQ categories. At retrieval time, it uses the resulting query log clusters as a form of FAQ smoothing. In our experiments, we found that the proposed system resolved some lexical disagreement problems between queries and FAQs.
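The retrieval-time idea can be illustrated with a minimal sketch. It assumes the query logs have already been grouped into FAQ categories (the latent semantic analysis step is not shown) and treats "FAQ smoothing" as a linear interpolation between a unigram model of the FAQ and a unigram model of its query log cluster. The interpolation weight `lam`, the probability floor, the function names, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def unigram_probs(texts):
    """Maximum-likelihood unigram distribution over whitespace tokens."""
    counts = Counter(tok for t in texts for tok in t.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def smoothed_log_score(query, faq_text, cluster_queries, lam=0.5, floor=1e-6):
    """Log-likelihood of the query under a FAQ model mixed with its cluster model.

    lam and floor are illustrative assumptions, not values from the paper.
    """
    p_faq = unigram_probs([faq_text])
    p_cluster = unigram_probs(cluster_queries) if cluster_queries else {}
    score = 0.0
    for tok in query.lower().split():
        p = (1 - lam) * p_faq.get(tok, 0.0) + lam * p_cluster.get(tok, 0.0)
        score += math.log(max(p, floor))  # floor avoids log(0) for unseen words
    return score


# Hypothetical toy FAQs and query log clusters (one cluster per FAQ category).
faqs = {
    "password": "how do i reset my password",
    "shipping": "how long does delivery take",
}
clusters = {
    "password": ["forgot login credentials", "cannot sign in to account"],
    "shipping": ["when will my order arrive", "track my package"],
}


def rank(query, lam=0.5):
    """Return FAQ ids ordered from best to worst smoothed score."""
    scores = {
        fid: smoothed_log_score(query, text, clusters.get(fid, []), lam)
        for fid, text in faqs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

In this toy setup the query "cannot sign in" shares no word with either FAQ, so the unsmoothed FAQ models cannot tell the two apart. Once the cluster models are mixed in, the "password" FAQ ranks first because earlier users phrased the same need with those words.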
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, H., Lee, H., Seo, J. (2005). Improving FAQ Retrieval Using Query Log Clustering in Latent Semantic Space. In: Lee, G.G., Yamada, A., Meng, H., Myaeng, S.H. (eds) Information Retrieval Technology. AIRS 2005. Lecture Notes in Computer Science, vol 3689. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562382_18
DOI: https://doi.org/10.1007/11562382_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29186-2
Online ISBN: 978-3-540-32001-2
eBook Packages: Computer Science, Computer Science (R0)