DOI: 10.1145/1835449.1835506
research-article

DivQ: diversification for keyword search over structured databases

Published: 19 July 2010

ABSTRACT

Keyword queries over structured databases are notoriously ambiguous. No single interpretation of a keyword query can satisfy all users, and multiple interpretations may yield overlapping results. This paper proposes a scheme to balance the relevance and novelty of keyword search results over structured databases. First, we present a probabilistic model that effectively ranks the possible interpretations of a keyword query over structured data. Then, we introduce a scheme to diversify the search results by re-ranking query interpretations, taking into account the redundancy of query results. Finally, we propose α-nDCG-W and WS-recall, adaptations of the α-nDCG and S-recall metrics that take into account the graded relevance of subtopics. Our evaluation on two real-world datasets demonstrates that search results obtained using the proposed diversification algorithms better characterize possible answers available in the database than the results of the initial relevance ranking.
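
To make the two baseline metrics named in the abstract concrete, the following is a minimal Python sketch of the standard (unweighted) α-nDCG and S-recall as they are commonly defined in the IR evaluation literature. It is not the paper's α-nDCG-W or WS-recall, whose graded-relevance weighting is not specified in the abstract. The function names, the representation of each result as a set of subtopic labels, and the simplification of building the ideal ordering greedily from the evaluated ranking itself (rather than from all judged results) are assumptions made for illustration.

```python
import math


def _gain(subtopics, seen, alpha):
    # Novelty-discounted gain: each subtopic contributes (1 - alpha)^(times already seen).
    return sum((1 - alpha) ** seen.get(s, 0) for s in subtopics)


def s_recall(ranking, n_subtopics, k):
    """Fraction of all subtopics covered by the top-k results."""
    covered = set()
    for subtopics in ranking[:k]:
        covered |= subtopics
    return len(covered) / n_subtopics


def alpha_dcg(ranking, k, alpha=0.5):
    """Discounted cumulative gain with a redundancy penalty controlled by alpha."""
    seen = {}
    score = 0.0
    for rank, subtopics in enumerate(ranking[:k], start=1):
        score += _gain(subtopics, seen, alpha) / math.log2(1 + rank)
        for s in subtopics:
            seen[s] = seen.get(s, 0) + 1
    return score


def greedy_ideal(ranking, k, alpha=0.5):
    """Greedy approximation of the ideal ordering (assumption: the candidate
    pool is the evaluated ranking itself)."""
    remaining = list(ranking)
    seen = {}
    ideal = []
    while remaining and len(ideal) < k:
        best = max(remaining, key=lambda subs: _gain(subs, seen, alpha))
        remaining.remove(best)
        ideal.append(best)
        for s in best:
            seen[s] = seen.get(s, 0) + 1
    return ideal


def alpha_ndcg(ranking, k, alpha=0.5):
    """alpha-DCG normalized by the greedy approximation of the ideal alpha-DCG."""
    ideal = alpha_dcg(greedy_ideal(ranking, k, alpha), k, alpha)
    return alpha_dcg(ranking, k, alpha) / ideal if ideal > 0 else 0.0


# Hypothetical example: three results, each labeled with the subtopics it covers.
redundant = [{"a"}, {"a"}, {"b"}]
diverse = [{"a"}, {"b"}, {"a"}]
```

With these toy rankings, `s_recall(redundant, 2, 2)` is 0.5 while `s_recall(diverse, 2, 2)` is 1.0, and `alpha_ndcg(diverse, 3)` reaches 1.0 while the redundant ordering scores lower. This is the kind of difference between a relevance-only ranking and a diversified one that such metrics are designed to expose.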


Published in

SIGIR '10: Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2010
944 pages
ISBN: 9781450301534
DOI: 10.1145/1835449

Copyright © 2010 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

SIGIR '10 paper acceptance rate: 87 of 520 submissions, 17%
Overall acceptance rate: 792 of 3,983 submissions, 20%
