DOI: 10.1145/2484028.2484056
research-article

Leveraging conceptual lexicon: query disambiguation using proximity information for patent retrieval

Published: 28 July 2013

ABSTRACT

Patent prior art search is a patent retrieval task whose goal is to rank documents that describe prior art related to a patent application. A defining property of patent retrieval is that the query topic is a full patent application, which does not represent a focused information need. This query-by-document nature introduces new challenges and requires investigations specific to the problem. Researchers have addressed it by considering different information resources for query reduction and query disambiguation. However, previous work has not fully studied the effect of using proximity information and exploiting domain-specific resources for query disambiguation.

In this paper, we first reduce the query document to its first claim. We then build a query-specific patent lexicon from the definitions of the International Patent Classification (IPC). We study how to expand queries by selecting expansion terms from the lexicon that are focused on the query topic. The key problem is how to determine whether an expansion term is focused on the query topic. We address this problem by exploiting proximity information: we assign higher weights to expansion terms that appear closer to query terms, on the intuition that such terms are more likely to be related to the query topic.
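
The proximity intuition described above can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which kernel is used, where proximity is measured, or how scores are aggregated. The sketch assumes a Gaussian kernel over token distance, measured inside one tokenized lexicon entry, with contributions summed over all query-term occurrences. The names `proximity_weights`, `top_expansion_terms`, and `sigma` are hypothetical.

```python
import math
from collections import defaultdict


def proximity_weights(tokens, query_terms, sigma=5.0):
    """Score each non-query token by its closeness to query-term occurrences.

    Assumption: a Gaussian kernel over token distance, summed over every
    query-term position in the same token sequence.
    """
    query_positions = [i for i, tok in enumerate(tokens) if tok in query_terms]
    scores = defaultdict(float)
    for i, tok in enumerate(tokens):
        if tok in query_terms:
            continue  # query terms are not candidates for expansion
        for q in query_positions:
            scores[tok] += math.exp(-((i - q) ** 2) / (2 * sigma ** 2))
    return dict(scores)


def top_expansion_terms(tokens, query_terms, k=3, sigma=5.0):
    """Return the k highest-scoring candidate terms (ties broken alphabetically)."""
    scores = proximity_weights(tokens, query_terms, sigma)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    # Toy lexicon entry; real entries would come from IPC definitions.
    entry = "valve seal for regulating fluid flow in pipes and printing ink".split()
    print(top_expansion_terms(entry, {"valve", "fluid"}, k=3))
```

A smaller `sigma` concentrates weight on immediate neighbours of the query terms, while a larger value lets more distant terms contribute. In a real system, stopword filtering and normalization of the scores would likely be needed before using them as expansion weights.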

Experimental results on two patent retrieval datasets show that the proposed method is effective and robust for query expansion, significantly outperforming standard pseudo-relevance feedback (PRF) and existing baselines in patent retrieval.


Published in

SIGIR '13: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
July 2013
1188 pages
ISBN: 9781450320344
DOI: 10.1145/2484028

Copyright © 2013 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

SIGIR '13 paper acceptance rate: 73 of 366 submissions, 20%
Overall acceptance rate: 792 of 3,983 submissions, 20%
