
Automatic extraction of citation information in Japanese patent applications

  • Regular Paper
International Journal on Digital Libraries

Abstract

The need for academic researchers to retrieve both patents and research papers is increasing, because applying for patents is now considered an important research activity. However, retrieving patents by keyword is laborious for researchers, because the terms used in patents are generally more abstract than those used in research papers, a choice made to enlarge the scope of the claims. Therefore, we have constructed a framework that facilitates patent retrieval for researchers and integrates research papers and patents by analysing the citation relationships between them. We identified the research papers cited in patents in two steps: (1) detection of sentences containing bibliographic information, and (2) extraction of bibliographic information from those sentences. To investigate the effectiveness of our method, we conducted two experiments. For Step 1, we prepared 42,073 sentences, among which a human subject manually identified 1,476 sentences containing citations of papers. For Step 2, we prepared 3,000 sentences, in which the titles, authors, and other bibliographic information were manually identified. We obtained a precision of 91.6% and a recall of 86.9% in Step 1, and a precision of 86.2% and a recall of 85.1% in Step 2. Finally, we constructed an information retrieval system that provides two methods of retrieving research papers and patents: retrieval by query, and retrieval via the citation relationships between research papers and patents.
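The precision and recall figures reported above follow the standard definitions: precision is the fraction of system-detected items that are correct, and recall is the fraction of manually identified items that the system found. As a minimal sketch of how such scores could be computed, the Python below compares a set of predicted sentence IDs against a set of gold IDs. The function name `precision_recall` and the toy IDs are illustrative assumptions; they are not the authors' evaluation code or data.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of item identifiers.

    predicted: items the system flagged (e.g. sentence IDs detected as citations)
    gold:      items a human marked as correct
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data (hypothetical, NOT taken from the paper's experiments).
gold_ids = {1, 4, 7, 9}
predicted_ids = {1, 4, 8, 9, 12}

p, r = precision_recall(predicted_ids, gold_ids)
print(f"precision={p:.3f} recall={r:.3f}")  # prints "precision=0.600 recall=0.750"
```

In the toy run, three of the five predicted IDs are in the gold set (precision 3/5) and three of the four gold IDs were found (recall 3/4). The same comparison applies to Step 1 at the sentence level and, under an analogous assumption, to Step 2 at the level of extracted bibliographic fields.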

Author information

Corresponding author

Correspondence to Hidetsugu Nanba.

About this article

Cite this article

Nanba, H., Anzen, N. & Okumura, M. Automatic extraction of citation information in Japanese patent applications. Int J Digit Libr 9, 151–161 (2008). https://doi.org/10.1007/s00799-008-0045-x

  • DOI: https://doi.org/10.1007/s00799-008-0045-x
