Abstract
This paper explores techniques for discovering terms, drawn from a selected subset of documents, that can replace given query terms. The Internet allows access to large numbers of documents archived in digital format. However, no user can be an expert in every field, and users who are not experts have trouble finding the documents that suit their purposes when they cannot formulate queries that narrow the search to the context they have in mind. Accordingly, we propose a method for extracting terms from searched documents to replace user-provided query terms. Our results show that our method is successful in discovering terms that can be used to narrow the search.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wakaki, H., Masada, T., Takasu, A., Adachi, J. (2006). A New Measure for Query Disambiguation Using Term Co-occurrences. In: Corchado, E., Yin, H., Botti, V., Fyfe, C. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2006. IDEAL 2006. Lecture Notes in Computer Science, vol 4224. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11875581_108
DOI: https://doi.org/10.1007/11875581_108
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45485-4
Online ISBN: 978-3-540-45487-8
eBook Packages: Computer Science (R0)