DOI: 10.1145/1255175.1255262
Article

Automatic patent classification using citation network information: an experimental study in nanotechnology

Published: 18 June 2007

Abstract

Classifying and organizing documents in repositories is an active research topic in digital library studies. Manually classifying the large volume of patents and patent applications managed by patent offices is a labor-intensive task. Many previous studies have used patent content for classification with the aim of automating this process. In this research we propose to use patent citation information, especially citation network structure information, to address the patent classification problem. We adopt a kernel-based approach and design kernel functions to capture content information and various citation-related information in patents. The kernels' performance is evaluated on a testbed of patents related to nanotechnology. Evaluation results show that our proposed labeled citation graph kernel, which utilizes citation network structures, significantly outperforms the kernels that use no citation information or only direct citation information.
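To make the kernel-based idea concrete, here is a minimal Python sketch of a composite kernel that mixes a content kernel with a direct-citation kernel. It is not the labeled citation graph kernel proposed in the paper, whose definition the abstract does not give. The function names, the `weight` parameter, the shared-citation measure, and the three toy patents are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def content_kernel(text_a, text_b):
    """Cosine similarity between bag-of-words term-frequency vectors."""
    tf_a = Counter(text_a.lower().split())
    tf_b = Counter(text_b.lower().split())
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm = sqrt(sum(v * v for v in tf_a.values())) * sqrt(sum(v * v for v in tf_b.values()))
    return dot / norm if norm else 0.0


def citation_kernel(cites_a, cites_b):
    """Shared cited patents, normalized by the sizes of both citation sets.

    Assumption: this uses direct citations only, not the wider network structure.
    """
    a, b = set(cites_a), set(cites_b)
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def composite_kernel(p, q, weight=0.5):
    """Convex combination of the two kernels; `weight` is a hypothetical tuning knob."""
    return (weight * content_kernel(p["text"], q["text"])
            + (1.0 - weight) * citation_kernel(p["cites"], q["cites"]))


def gram_matrix(patents, weight=0.5):
    """Pairwise kernel values, the input a kernel machine expects."""
    return [[composite_kernel(p, q, weight) for q in patents] for p in patents]


# Hypothetical toy data: patent text plus the identifiers of the patents it cites.
patents = [
    {"text": "carbon nanotube field emission device", "cites": ["P1", "P2"]},
    {"text": "carbon nanotube transistor device", "cites": ["P2", "P3"]},
    {"text": "quantum dot solar cell", "cites": ["P9"]},
]
K = gram_matrix(patents)
```

Because a weighted sum of valid kernels with non-negative weights is itself a valid kernel, a matrix such as `K` could in principle be passed to any support vector machine implementation that accepts precomputed kernels. A kernel that uses citation network structure would replace `citation_kernel` with a comparison of the citation neighborhoods around each patent.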

    Published In

    JCDL '07: Proceedings of the 7th ACM/IEEE-CS joint conference on Digital libraries
    June 2007
    534 pages
    ISBN:9781595936448
    DOI:10.1145/1255175
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. citation network
    2. graph kernel
    3. kernel-based method
    4. machine learning
    5. nanotechnology
    6. patent classification

    Qualifiers

    • Article

    Conference

    JCDL07: Joint Conference on Digital Libraries
    June 18 - 23, 2007
    Vancouver, BC, Canada

    Acceptance Rates

    Overall Acceptance Rate 415 of 1,482 submissions, 28%

    Cited By

    • (2024) Adaptive Taxonomy Learning and Historical Patterns Modeling for Patent Classification. ACM Transactions on Information Systems 42:6 (1-24). DOI: 10.1145/3674834. Online publication date: 18-Oct-2024
    • (2023) Classification of Research Papers on Radio Frequency Electromagnetic Field (RF-EMF) Using Graph Neural Networks (GNN). Applied Sciences 13:7 (4614). DOI: 10.3390/app13074614. Online publication date: 5-Apr-2023
    • (2023) Japanese Patent Classification Using Few-shot Learning. 2023 14th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI) (118-121). DOI: 10.1109/IIAI-AAI59060.2023.00032. Online publication date: 8-Jul-2023
    • (2020) Parameter tuning Naïve Bayes for automatic patent classification. World Patent Information 61 (101968). DOI: 10.1016/j.wpi.2020.101968. Online publication date: Jun-2020
    • (2018) Automated patent landscaping. Artificial Intelligence and Law 26:2 (103-125). DOI: 10.1007/s10506-018-9222-4. Online publication date: 1-Jun-2018
    • (2017) Future Patent Search. Current Challenges in Patent Information Retrieval (433-455). DOI: 10.1007/978-3-662-53817-3_17. Online publication date: 26-Mar-2017
    • (2017) Supervised Approaches to Assign Cooperative Patent Classification (CPC) Codes to Patents. Mining Intelligence and Knowledge Exploration (22-34). DOI: 10.1007/978-3-319-71928-3_3. Online publication date: 28-Nov-2017
    • (2017) Application of k-Step Random Walk Paths to Graph Kernel for Automatic Patent Classification. Digital Libraries: Data, Information, and Knowledge for Digital Lives (14-29). DOI: 10.1007/978-3-319-70232-2_2. Online publication date: 3-Nov-2017
    • (2014) Complex Network Analysis of Research Funding: A Case Study of NSF Grants. State of the Art Applications of Social Network Analysis (163-187). DOI: 10.1007/978-3-319-05912-9_8. Online publication date: 15-May-2014
    • (2013) Discovering and assessing fields of expertise in nanomedicine. Scientometrics 94:3 (1111-1136). DOI: 10.1007/s11192-012-0891-6. Online publication date: 1-Mar-2013
