Abstract
Vector space techniques can extract semantically similar words from the co-occurrence statistics of words in large text collections. We used a technique called Random Indexing to accumulate context vectors for Swedish, French and Italian, and then used those context vectors to perform automatic query expansion. This paper reports on our CLEF 2002 monolingual query expansion experiments in these three languages.
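To make the pipeline named in the abstract concrete, the following is a minimal sketch of Random Indexing followed by nearest-neighbour query expansion. It is not the authors' implementation: the dimensionality, the number of non-zero entries, the window size, the unweighted accumulation, and all function names (`index_vector`, `build_context_vectors`, `expand_query`) are assumptions chosen for illustration, since the abstract does not state them.

```python
import math
import random
from collections import defaultdict

# All three constants are illustrative assumptions, not values from the paper.
DIM = 300      # dimensionality of index and context vectors
NONZERO = 8    # number of +1/-1 entries in each index vector
WINDOW = 2     # number of words on each side treated as context


def index_vector(word, dim=DIM, nonzero=NONZERO):
    """Return a sparse ternary random vector; seeding by word makes it reproducible."""
    rng = random.Random(word)
    vec = [0] * dim
    for i, pos in enumerate(rng.sample(range(dim), nonzero)):
        vec[pos] = 1 if i % 2 == 0 else -1  # half +1, half -1
    return vec


def build_context_vectors(sentences, window=WINDOW):
    """Accumulate, for each word, the index vectors of its neighbours."""
    context = defaultdict(lambda: [0] * DIM)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j == i:
                    continue
                ctx = context[word]
                for k, v in enumerate(index_vector(tokens[j])):
                    if v:
                        ctx[k] += v
    return dict(context)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def expand_query(query_terms, context, k=1):
    """Append to the query the k nearest neighbours of each known query term."""
    expanded = list(query_terms)
    for term in query_terms:
        if term not in context:
            continue
        scored = sorted(
            ((cosine(context[term], vec), other)
             for other, vec in context.items() if other != term),
            reverse=True,
        )
        for _, other in scored[:k]:
            if other not in expanded:
                expanded.append(other)
    return expanded


# Toy corpus (hypothetical): "car" and "automobile" share identical contexts.
corpus = [["the", "car", "drives"], ["the", "automobile", "drives"]]
vectors = build_context_vectors(corpus)
print(expand_query(["car"], vectors))
```

Words that occur in the same contexts accumulate the same index vectors, so their context vectors point in similar directions and the expansion step picks them up as neighbours. A realistic setup would use a large corpus, and possibly term weighting or a different neighbourhood threshold, none of which is specified here.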
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sahlgren, M., Karlgren, J., Cöster, R., Järvinen, T. (2003). SICS at CLEF 2002: Automatic Query Expansion Using Random Indexing. In: Peters, C., Braschler, M., Gonzalo, J., Kluck, M. (eds) Advances in Cross-Language Information Retrieval. CLEF 2002. Lecture Notes in Computer Science, vol 2785. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45237-9_26
DOI: https://doi.org/10.1007/978-3-540-45237-9_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40830-7
Online ISBN: 978-3-540-45237-9
eBook Packages: Springer Book Archive