Abstract
Users often search for entities instead of documents and, in this setting, are willing to provide extra input in addition to a query, such as category information and example entities. We propose a general probabilistic framework for entity search to evaluate, and provide insight into, the many ways of using these types of input for query modeling. We focus on the use of category information and show the advantage of a category-based representation over a term-based representation. We also demonstrate the effectiveness of category-based expansion using example entities. Our best performing model shows very competitive performance on the INEX-XER entity ranking and list completion tasks.
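To make the idea of category-based expansion concrete, the following is a minimal sketch and not the model proposed in the paper. It assumes that a query's categories and each example entity's categories are given as plain lists of labels, that each list is turned into a uniform distribution, and that the two sources are mixed with an interpolation weight `lam`. The function names, the weight, and the dot-product scoring are all hypothetical choices made for illustration.

```python
from collections import Counter


def category_distribution(categories):
    """Turn a list of category labels into a probability distribution."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def expand_with_examples(query_cats, example_entity_cats, lam=0.5):
    """Mix the query's category distribution with one estimated from examples.

    lam is a hypothetical interpolation weight: 0 ignores the example
    entities, 1 ignores the categories supplied with the query.
    """
    p_query = category_distribution(query_cats)

    # Average the per-example distributions so each example counts equally.
    p_examples = Counter()
    for cats in example_entity_cats:
        for c, p in category_distribution(cats).items():
            p_examples[c] += p / len(example_entity_cats)

    all_cats = set(p_query) | set(p_examples)
    return {
        c: (1 - lam) * p_query.get(c, 0.0) + lam * p_examples.get(c, 0.0)
        for c in all_cats
    }


def score(entity_cats, query_model):
    """Score an entity by the overlap of its categories with the query model."""
    p_entity = category_distribution(entity_cats)
    return sum(p * query_model.get(c, 0.0) for c, p in p_entity.items())


# Toy data, invented for this sketch.
model = expand_with_examples(
    ["museums"],
    [["museums", "art galleries"], ["art galleries"]],
    lam=0.5,
)
```

In this toy setup the expanded model gives weight to "art galleries" even though the query only named "museums", so an entity filed under "art galleries" scores above one filed under an unrelated category.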
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Balog, K., Bron, M., de Rijke, M. (2010). Category-Based Query Modeling for Entity Search. In: Gurrin, C., et al. Advances in Information Retrieval. ECIR 2010. Lecture Notes in Computer Science, vol 5993. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12275-0_29
DOI: https://doi.org/10.1007/978-3-642-12275-0_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12274-3
Online ISBN: 978-3-642-12275-0
eBook Packages: Computer Science (R0)