Abstract
This research introduces a new query expansion method that uses Wikipedia and its hyperlink structure to find related terms for reformulating a query. A query is first analysed by splitting it into query aspects. We then measure how well each aspect is represented in the original search results; poorly represented aspects are found to be an excellent source of query improvement. Our main contribution is the use of Wikipedia to identify aspects, to detect underrepresented aspects, and to weight the expansion terms. Results show that our approach improves on the original query and its search results, and outperforms two existing query expansion methods.
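The idea of underrepresented aspects can be illustrated with a minimal sketch. It is not the method of the paper: the function names, the substring matching, the 0.5 threshold, and the hand-written aspect term sets are all assumptions made for illustration, and the term sets merely stand in for terms that would be derived from Wikipedia hyperlinks. Expansion terms are left unweighted here.

```python
from typing import Dict, List, Set


def aspect_coverage(aspects: Dict[str, Set[str]], results: List[str]) -> Dict[str, float]:
    """Fraction of result snippets that mention at least one term of each aspect.

    Assumption: aspect terms are lowercase and matched as plain substrings.
    """
    coverage = {}
    for name, terms in aspects.items():
        hits = sum(
            1 for text in results
            if any(term in text.lower() for term in terms)
        )
        coverage[name] = hits / len(results) if results else 0.0
    return coverage


def underrepresented(coverage: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Aspects whose coverage falls below the (hypothetical) threshold."""
    return sorted(name for name, score in coverage.items() if score < threshold)


def expand_query(query: str, aspects: Dict[str, Set[str]],
                 results: List[str], threshold: float = 0.5) -> str:
    """Append terms of underrepresented aspects that the query does not already contain."""
    weak = underrepresented(aspect_coverage(aspects, results), threshold)
    extra: List[str] = []
    for name in weak:
        for term in sorted(aspects[name]):
            if term not in query.lower() and term not in extra:
                extra.append(term)
    return query + " " + " ".join(extra) if extra else query


if __name__ == "__main__":
    # Hand-written stand-ins for aspect terms; purely illustrative.
    aspects = {
        "jaguar": {"jaguar", "big cat", "panthera onca"},
        "speed": {"speed", "top speed", "km/h"},
    }
    results = [
        "Jaguar cars for sale",
        "Jaguar dealership near you",
        "The jaguar is a big cat",
        "Jaguar XF review",
    ]
    print(expand_query("jaguar speed", aspects, results))
```

In this toy example every snippet mentions the "jaguar" aspect and none mentions the "speed" aspect, so only terms from the "speed" aspect are appended to the query.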
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bruce, C., Gao, X., Andreae, P., Jabeen, S. (2012). Query Expansion Powered by Wikipedia Hyperlinks. In: Thielscher, M., Zhang, D. (eds) AI 2012: Advances in Artificial Intelligence. AI 2012. Lecture Notes in Computer Science, vol 7691. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35101-3_36
DOI: https://doi.org/10.1007/978-3-642-35101-3_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35100-6
Online ISBN: 978-3-642-35101-3
eBook Packages: Computer Science (R0)