Abstract
Keyword search is the most popular way to access information. In this paper, we introduce a novel approach, based on a hidden Markov model, for determining the correct resources for user-supplied queries. The user-supplied query is modeled as the observed data, and background knowledge is used for parameter estimation; in particular, we leverage the semantic relationships between resources to estimate the parameters. In this approach, query segmentation and resource disambiguation are tightly interwoven: first, an initial set of potential segments is obtained by leveraging the underlying knowledge base; then, the final set of segments is determined once the most likely resource mapping has been computed. While linguistic analysis (e.g. named entity recognition, multi-word unit recognition and POS tagging) fails in the case of keyword-based queries, we show that our statistical approach is robust with regard to query expression variance. Our experiments yield very promising results.
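To make the modeling idea concrete, the following is a minimal sketch of Viterbi decoding over a toy hidden Markov model in which query segments are the observations and candidate resources are the hidden states. It is an illustration only, not the estimation procedure of the paper: the resource names (`ex:Barack_Obama`, `ex:Obama_Fukui`, `ex:spouse`, `ex:Wife_Film`) and every probability below are invented for the example, whereas the paper derives its parameters from the knowledge base.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (log-probability, state path) of the most likely hidden sequence."""

    def log(p):
        # Zero probabilities map to -inf so impossible paths never win.
        return math.log(p) if p > 0 else float("-inf")

    # best[s] = (log-prob of the best path ending in state s, that path)
    best = {
        s: (log(start_p[s]) + log(emit_p[s].get(observations[0], 0.0)), [s])
        for s in states
    }
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            # Pick the predecessor that maximizes the path score into s.
            score, path = max(
                (
                    (best[prev][0] + log(trans_p[prev].get(s, 0.0)), best[prev][1])
                    for prev in states
                ),
                key=lambda t: t[0],
            )
            new_best[s] = (score + log(emit_p[s].get(obs, 0.0)), path + [s])
        best = new_best
    return max(best.values(), key=lambda t: t[0])


# Hypothetical candidate resources (hidden states) for the query "obama wife".
states = ["ex:Barack_Obama", "ex:Obama_Fukui", "ex:spouse", "ex:Wife_Film"]

# Invented parameters: uniform start, emissions tie each resource to one segment,
# transitions stand in for semantic relatedness between resources.
start_p = {s: 0.25 for s in states}
emit_p = {
    "ex:Barack_Obama": {"obama": 1.0},
    "ex:Obama_Fukui": {"obama": 1.0},
    "ex:spouse": {"wife": 1.0},
    "ex:Wife_Film": {"wife": 1.0},
}
trans_p = {
    "ex:Barack_Obama": {"ex:Barack_Obama": 0.1, "ex:Obama_Fukui": 0.1,
                        "ex:spouse": 0.7, "ex:Wife_Film": 0.1},
    "ex:Obama_Fukui": {"ex:Barack_Obama": 0.4, "ex:Obama_Fukui": 0.4,
                       "ex:spouse": 0.1, "ex:Wife_Film": 0.1},
    "ex:spouse": {s: 0.25 for s in states},
    "ex:Wife_Film": {s: 0.25 for s in states},
}

score, path = viterbi(["obama", "wife"], states, start_p, trans_p, emit_p)
print(path, math.exp(score))
```

Both candidate resources emit the segment "obama" equally well in this toy setup, so the choice between them is driven entirely by the transition weights, which here play the role that the semantic relationships between resources play in the approach described above.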
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shekarpour, S., Ngonga Ngomo, AC., Auer, S. (2013). Keyword-Driven Resource Disambiguation over RDF Knowledge Bases. In: Takeda, H., Qu, Y., Mizoguchi, R., Kitamura, Y. (eds) Semantic Technology. JIST 2012. Lecture Notes in Computer Science, vol 7774. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37996-3_11
DOI: https://doi.org/10.1007/978-3-642-37996-3_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37995-6
Online ISBN: 978-3-642-37996-3
eBook Packages: Computer Science (R0)