DOI: 10.1145/2808194.2809446
research-article

Query Expansion with Freebase

Published: 27 September 2015

Abstract

Large knowledge bases are being developed to describe entities, their attributes, and their relationships to other entities. Prior research mostly focuses on the construction of knowledge bases, while how to use them in information retrieval is still an open problem. This paper presents a simple and effective method of using one such knowledge base, Freebase, to improve query expansion, a classic and widely studied information retrieval task. It investigates two methods of identifying the entities associated with a query, and two methods of using those entities to perform query expansion. A supervised model combines information derived from Freebase descriptions and categories to select terms that are effective for query expansion. Experiments on the ClueWeb09 dataset with TREC Web Track queries demonstrate that these methods are almost 30% more effective than strong, state-of-the-art query expansion algorithms. In addition to improving average performance, some of these methods have better win/loss ratios than baseline algorithms, with 50% fewer queries damaged.
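The abstract mentions selecting expansion terms from the descriptions of entities linked to a query. As a rough illustration only, the sketch below scores candidate terms by their normalized frequency in entity descriptions, weighted by an entity score. It is not the paper's supervised model: the scoring rule, the stopword list, the function names, and the toy descriptions are all assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "is", "to", "for", "on", "by", "with"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stopwords."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


def select_expansion_terms(query, entity_descriptions, k=3):
    """Return the top-k (term, score) pairs drawn from entity descriptions.

    entity_descriptions is a list of (entity_score, description_text) pairs;
    entity_score is a hypothetical confidence that the entity matches the query.
    """
    query_terms = set(tokenize(query))
    scores = Counter()
    for entity_score, description in entity_descriptions:
        tokens = tokenize(description)
        if not tokens:
            continue
        for term, count in Counter(tokens).items():
            if term not in query_terms:  # never re-add original query terms
                scores[term] += entity_score * count / len(tokens)
    # Sort by score (descending), then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


def expand_query(query, expansion_terms):
    """Append the selected terms to the original query string."""
    return query + " " + " ".join(term for term, _ in expansion_terms)


# Toy data, invented for this example.
descriptions = [
    (1.0, "Carnegie Mellon University is a private research university in Pittsburgh"),
    (0.5, "Pittsburgh is a city in Pennsylvania"),
]
terms = select_expansion_terms("carnegie mellon", descriptions, k=3)
expanded = expand_query("carnegie mellon", terms)
```

In a real retrieval system the selected terms would usually be weighted against the original query rather than simply appended; the plain concatenation here only keeps the sketch short.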

Published In

ICTIR '15: Proceedings of the 2015 International Conference on The Theory of Information Retrieval
September 2015
402 pages
ISBN: 9781450338332
DOI: 10.1145/2808194
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. freebase
  2. knowledge base
  3. pseudo relevance feedback
  4. query expansion

Qualifiers

  • Research-article

Conference

ICTIR '15
Acceptance Rates

ICTIR '15 Paper Acceptance Rate 29 of 57 submissions, 51%;
Overall Acceptance Rate 235 of 527 submissions, 45%

Cited By

  • (2025) Knowledge graph based entity selection framework for ad-hoc retrieval. Web Semantics: Science, Services and Agents on the World Wide Web 84:C. DOI: 10.1016/j.websem.2024.100848. Online publication date: 18-Feb-2025
  • (2024) Diabetic Patient Diagnosis through the use of Machine Learning Techniques. 2024 5th International Conference on Mobile Computing and Sustainable Informatics (ICMCSI), pages 466-469. DOI: 10.1109/ICMCSI61536.2024.00073. Online publication date: 18-Jan-2024
  • (2024) Bayesian hypernetwork collaborates with time-difference evolutional network for temporal knowledge prediction. Neural Networks 175 (106146). DOI: 10.1016/j.neunet.2024.106146. Online publication date: Jul-2024
  • (2024) Event-Specific Document Ranking Through Multi-stage Query Expansion Using an Event Knowledge Graph. Advances in Information Retrieval, pages 333-348. DOI: 10.1007/978-3-031-56060-6_22. Online publication date: 16-Mar-2024
  • (2023) Conversational Context-sensitive Ad Generation with a Few Core-Queries. ACM Transactions on Interactive Intelligent Systems 13:3, pages 1-37. DOI: 10.1145/3588578. Online publication date: 11-Sep-2023
  • (2023) Entity-Based Relevance Feedback for Document Retrieval. Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval, pages 177-187. DOI: 10.1145/3578337.3605128. Online publication date: 9-Aug-2023
  • (2023) Generative Relevance Feedback with Large Language Models. Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 2026-2031. DOI: 10.1145/3539618.3591992. Online publication date: 19-Jul-2023
  • (2023) A User-Devised Search Query and Clustering Technique for Searching Through Research Papers. 2023 15th International Congress on Advanced Applied Informatics Winter (IIAI-AAI-Winter), pages 318-321. DOI: 10.1109/IIAI-AAI-Winter61682.2023.00065. Online publication date: 11-Dec-2023
  • (2023) Transformer-based Temporal Knowledge Graph Completion. 2023 IEEE 3rd International Conference on Computer Communication and Artificial Intelligence (CCAI), pages 443-448. DOI: 10.1109/CCAI57533.2023.10201286. Online publication date: 26-May-2023
  • (2023) Adaptive pseudo-Siamese policy network for temporal knowledge prediction. Neural Networks 160, pages 192-201. DOI: 10.1016/j.neunet.2023.01.004. Online publication date: Mar-2023
