ABSTRACT
Keyword queries over structured databases are notoriously ambiguous. No single interpretation of a keyword query can satisfy all users, and multiple interpretations may yield overlapping results. This paper proposes a scheme to balance the relevance and novelty of keyword search results over structured databases. First, we present a probabilistic model that effectively ranks the possible interpretations of a keyword query over structured data. Second, we introduce a scheme that diversifies the search results by re-ranking query interpretations, taking into account the redundancy of their results. Finally, we propose α-nDCG-W and WS-recall, adaptations of the α-nDCG and S-recall metrics that take into account the graded relevance of subtopics. Our evaluation on two real-world datasets demonstrates that search results obtained using the proposed diversification algorithms better characterize the possible answers available in the database than the results of the initial relevance ranking.
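To make the idea of re-ranking interpretations by relevance and redundancy concrete, the following is a minimal sketch, not the paper's algorithm. It assumes each query interpretation comes with a relevance score and a set of result identifiers, measures redundancy as the Jaccard overlap with already selected interpretations, and combines the two with a hypothetical `trade_off` weight in a greedy, maximal-marginal-relevance style loop. The names `diversify`, `jaccard`, and `trade_off` are illustrative assumptions.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two result sets, in [0, 1] (assumed redundancy measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def diversify(relevance: Dict[str, float],
              results: Dict[str, Set[str]],
              k: int,
              trade_off: float = 0.5) -> List[str]:
    """Greedily select up to k interpretations.

    Each step picks the candidate that maximizes
    trade_off * relevance - (1 - trade_off) * redundancy,
    where redundancy is the largest overlap with any interpretation
    selected so far. This scoring rule is an assumption for illustration.
    """
    selected: List[str] = []
    candidates = set(relevance)

    while candidates and len(selected) < k:
        def score(q: str) -> float:
            redundancy = max(
                (jaccard(results[q], results[s]) for s in selected),
                default=0.0,
            )
            return trade_off * relevance[q] - (1 - trade_off) * redundancy

        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)

    return selected


# Hypothetical interpretations: q1 and q2 return identical results.
relevance = {"q1": 0.9, "q2": 0.8, "q3": 0.5}
results = {"q1": {"a", "b", "c"}, "q2": {"a", "b", "c"}, "q3": {"d", "e"}}

diversified = diversify(relevance, results, k=2)                     # ['q1', 'q3']
by_relevance = diversify(relevance, results, k=2, trade_off=1.0)     # ['q1', 'q2']
```

With equal weighting, the less relevant but non-overlapping interpretation `q3` displaces `q2`, whose results duplicate those of `q1`; setting `trade_off` to 1.0 falls back to the plain relevance ranking.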
Index Terms
- DivQ: diversification for keyword search over structured databases