Abstract
This paper presents the participation of the CACAO prototype in the Log Analysis for Digital Societies (LADS) task of the LogCLEF 2009 track. In our experiment we investigated the possibility of exploiting the TEL log data as a source for inferring new translations, thus enriching existing translation dictionaries. The proposed approach is based on the assumption that, in the context of a multilingual digital library, the same query is likely to be repeated across different languages. We applied our approach to the logs from TEL, and the results obtained are quite promising.
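The assumption stated above can be illustrated with a minimal sketch. It is not the CACAO implementation, which the abstract does not describe. It assumes, hypothetically, that each log record carries a session identifier, an interface language, and a query string, and that queries issued in different languages within one session are candidate translations of each other. The record layout, the sample data, and the `min_count` threshold are all illustrative assumptions.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical log records: (session_id, interface_language, query).
# The layout and values are illustrative, not the actual TEL log schema.
SAMPLE_LOG = [
    ("s1", "en", "history"),
    ("s1", "fr", "histoire"),
    ("s2", "en", "history"),
    ("s2", "fr", "histoire"),
    ("s3", "en", "history"),
    ("s3", "de", "geschichte"),
    ("s4", "en", "music"),
    ("s4", "fr", "histoire"),
]


def candidate_translations(log, min_count=2):
    """Count cross-language query pairs that share a session.

    Returns a dict mapping (lang_a, query_a, lang_b, query_b) to the number
    of sessions containing both queries, keeping only pairs observed in at
    least min_count sessions.
    """
    # Group the distinct (language, normalized query) entries by session.
    by_session = defaultdict(set)
    for session_id, lang, query in log:
        by_session[session_id].add((lang, query.strip().lower()))

    counts = Counter()
    for entries in by_session.values():
        # Sorting gives each pair a canonical order, so (a, b) and (b, a) merge.
        for (lang_a, q_a), (lang_b, q_b) in combinations(sorted(entries), 2):
            if lang_a != lang_b:
                counts[(lang_a, q_a, lang_b, q_b)] += 1

    # Pairs seen in a single session are treated as noise and dropped.
    return {pair: n for pair, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    for pair, n in candidate_translations(SAMPLE_LOG).items():
        print(pair, n)
```

On the sample data only the pair ("history", "histoire") recurs across sessions and passes the threshold. The one-off pairing of "music" with "histoire" is filtered out, which shows why a frequency cut-off of some kind is needed before such pairs are added to a dictionary.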
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bosca, A., Dini, L. (2010). User Logs as a Means to Enrich and Refine Translation Dictionaries. In: Peters, C., et al. Multilingual Information Access Evaluation I. Text Retrieval Experiments. CLEF 2009. Lecture Notes in Computer Science, vol 6241. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15754-7_67
DOI: https://doi.org/10.1007/978-3-642-15754-7_67
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15753-0
Online ISBN: 978-3-642-15754-7
eBook Packages: Computer Science (R0)