Abstract
The objective of the 2009 CLEF-IP Track was to find documents that constitute prior art for a given patent. We explored a wide range of simple pre-processing and post-processing strategies, using Mean Average Precision (MAP) for evaluation. Once we had determined the best document representation, we tuned a classical Information Retrieval engine to perform the retrieval step. Finally, we explored two post-processing strategies. The first was filtering on IPC codes: in our experiments, using the complete IPC codes for filtering led to greater improvements than using 4-digit IPC codes. The second was to exploit the citations of retrieved patents in order to boost the scores of cited patents. Combining all selected strategies, we computed optimal runs that reached a MAP of 0.122 for the training set and a MAP of 0.129 for the official 2009 CLEF-IP XL set.
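To make the two post-processing strategies concrete, the following is a minimal Python sketch and not the authors' implementation. It makes several assumptions that the abstract does not confirm: filtering is a hard filter that drops retrieved patents sharing no IPC code with the query, a "4-digit" code is the first four characters of the full code, and the boost adds a fixed fraction of a retrieved patent's score to each patent it cites. The function names, the `weight` parameter, and the patent identifiers are all hypothetical.

```python
from collections import defaultdict


def ipc_prefix(code, length=None):
    """Return the full IPC code, or only its first `length` characters."""
    code = code.replace(" ", "")
    return code if length is None else code[:length]


def filter_by_ipc(ranked, query_ipcs, doc_ipcs, length=None):
    """Keep retrieved patents that share at least one IPC code with the query.

    `ranked` is a list of (doc_id, score) pairs; `length=None` compares
    complete codes, `length=4` compares truncated codes (assumed meaning
    of "4-digit").
    """
    wanted = {ipc_prefix(c, length) for c in query_ipcs}
    kept = []
    for doc_id, score in ranked:
        codes = {ipc_prefix(c, length) for c in doc_ipcs.get(doc_id, ())}
        if codes & wanted:
            kept.append((doc_id, score))
    return kept


def boost_cited(ranked, citations, weight=0.5):
    """Add `weight` times each retrieved patent's score to the patents it cites.

    Cited patents that were not retrieved enter the list with only their
    boost (an assumption of this sketch).
    """
    scores = defaultdict(float)
    for doc_id, score in ranked:
        scores[doc_id] += score
    for doc_id, score in ranked:
        for cited in citations.get(doc_id, ()):
            scores[cited] += weight * score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data with hypothetical identifiers.
ranked = [("EP1", 2.0), ("EP2", 1.5), ("EP3", 1.0)]
doc_ipcs = {"EP1": ["A61K 31/00"], "EP2": ["A61K 9/20"], "EP3": ["H04L 12/28"]}
query_ipcs = ["A61K 31/00"]
citations = {"EP1": ["EP2"], "EP2": ["EP9"]}

full = filter_by_ipc(ranked, query_ipcs, doc_ipcs)             # complete codes
short = filter_by_ipc(ranked, query_ipcs, doc_ipcs, length=4)  # truncated codes
boosted = boost_cited(short, citations)
```

On this toy data the complete-code filter keeps only `EP1`, while the truncated filter also keeps `EP2`; the boost then moves `EP2` above `EP1` because `EP1` cites it. The example only illustrates the mechanics and says nothing about which setting performs better on real data.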
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gobeill, J., Pasche, E., Teodoro, D., Ruch, P. (2010). Simple Pre and Post Processing Strategies for Patent Searching in CLEF Intellectual Property Track 2009. In: Peters, C., et al. Multilingual Information Access Evaluation I. Text Retrieval Experiments. CLEF 2009. Lecture Notes in Computer Science, vol 6241. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15754-7_53
DOI: https://doi.org/10.1007/978-3-642-15754-7_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15753-0
Online ISBN: 978-3-642-15754-7
eBook Packages: Computer Science (R0)