IPC Selection Using Collection Selection Algorithms

  • Conference paper
Multidisciplinary Information Retrieval (IRFC 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8849)

Abstract

In this paper we view the automated selection of patent classification codes as a collection selection problem that can be addressed with existing methods, which we extend and adapt for the patent domain. Our work exploits the manually assigned International Patent Classification (IPC) codes of patent documents to cluster, distribute and index patents across hundreds or thousands of sub-collections. We examine different collection selection methods (CORI, Bordafuse, ReciRank and multilayer) and compare their effectiveness in selecting relevant IPCs. The multilayer method, in addition to using the topical relevance of IPCs at a specific level (e.g. sub-class), exploits the topical relevance of their ancestors in the IPC hierarchy and aggregates these multiple relevance estimates into a single estimate. The results show that multilayer outperforms CORI and the fusion-based methods in the task of IPC suggestion.
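The fusion-based and multilayer ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the scoring rules below are the common textbook forms (sum of reciprocal ranks for ReciRank, sum of Borda points for Bordafuse), and the multilayer step is assumed to be a simple linear interpolation between a sub-class score and the score of its parent class. The function names, the `weight` parameter and the toy document IDs are all hypothetical. The only IPC fact relied on is that the first three characters of a sub-class symbol (e.g. `H04` in `H04L`) identify its class.

```python
from collections import defaultdict


def recirank_scores(ranked_docs, doc_to_ipc):
    """Score each IPC by the sum of reciprocal ranks of its retrieved documents."""
    scores = defaultdict(float)
    for rank, doc_id in enumerate(ranked_docs, start=1):
        scores[doc_to_ipc[doc_id]] += 1.0 / rank
    return dict(scores)


def bordafuse_scores(ranked_docs, doc_to_ipc):
    """Score each IPC by Borda points: the top document of n earns n points, the last earns 1."""
    n = len(ranked_docs)
    scores = defaultdict(float)
    for rank, doc_id in enumerate(ranked_docs, start=1):
        scores[doc_to_ipc[doc_id]] += n - rank + 1
    return dict(scores)


def normalize(scores):
    """Rescale scores so they sum to 1, making different levels comparable."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()} if total else dict(scores)


def multilayer_scores(ranked_docs, doc_to_subclass, weight=0.5):
    """Assumed aggregation: interpolate each sub-class score with its parent class score."""
    # Sub-class level evidence.
    sub = normalize(recirank_scores(ranked_docs, doc_to_subclass))
    # Class level evidence: the class is the first three characters of the sub-class symbol.
    doc_to_class = {d: s[:3] for d, s in doc_to_subclass.items()}
    cls = normalize(recirank_scores(ranked_docs, doc_to_class))
    return {s: weight * v + (1 - weight) * cls.get(s[:3], 0.0) for s, v in sub.items()}


def rank_ipcs(scores):
    """Return IPC symbols ordered from most to least relevant."""
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Toy ranking of sampled patents returned for a query, best first (hypothetical data).
    ranked = ["d1", "d2", "d3", "d4"]
    doc_to_subclass = {"d1": "H04L", "d2": "G06F", "d3": "H04L", "d4": "G06N"}
    print(rank_ipcs(recirank_scores(ranked, doc_to_subclass)))
    print(rank_ipcs(bordafuse_scores(ranked, doc_to_subclass)))
    print(multilayer_scores(ranked, doc_to_subclass))
```

In this toy case the ancestor evidence raises the score of `G06N` relative to its sub-class-only score, because its sibling `G06F` also contributes to the shared parent class `G06`. That is the kind of effect the multilayer method is described as exploiting, although the actual aggregation used in the paper may differ.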

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Giachanou, A., Salampasis, M. (2014). IPC Selection Using Collection Selection Algorithms. In: Lamas, D., Buitelaar, P. (eds) Multidisciplinary Information Retrieval. IRFC 2014. Lecture Notes in Computer Science, vol 8849. Springer, Cham. https://doi.org/10.1007/978-3-319-12979-2_4

  • DOI: https://doi.org/10.1007/978-3-319-12979-2_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12978-5

  • Online ISBN: 978-3-319-12979-2

  • eBook Packages: Computer Science, Computer Science (R0)