Elsevier

Applied Soft Computing

Volume 41, April 2016, Pages 305-316

A patent quality analysis and classification system using self-organizing maps with support vector machine

https://doi.org/10.1016/j.asoc.2016.01.020

Highlights

  • An automatic patent quality analysis and classification system is developed.

  • The self-organizing map approach is used to cluster previously published patents into different quality groups.

  • Kernel principal component analysis is used to transform the original features into a nonlinear feature space to improve classification performance.

  • The support vector machine is used to build the patent quality classification model.

  • A series of experiments on thin film solar cell patent data is conducted, and the results are very encouraging.

Abstract

A large number of patents are approved by patent officers each year, and current patent systems face the serious problem of evaluating the quality of these patents. Researchers and analysts have traditionally focused only on developing various patent quality indicators, but these indicators have no predictive power for new patent applications or publications. Therefore, data mining (DM) approaches are employed in this article to identify and classify the quality of a new patent in time. An automatic patent quality analysis and classification system, named SOM-KPCA-SVM, is developed according to patent quality indicators and characteristics. First, the self-organizing map (SOM) approach is used to cluster previously published patents into different quality groups according to the patent quality indicators, and it defines the quality type of each group without relying on experts. Second, the kernel principal component analysis (KPCA) approach is used to transform the original features into a nonlinear feature space in order to improve classification performance. Finally, the support vector machine (SVM) is used to build the patent quality classification model. The proposed SOM-KPCA-SVM is applied to classify patent quality automatically in patent data on thin film solar cells. Experimental results show that the proposed system can perform the analysis effectively compared with the traditional manual approach.

Graphical abstract

The framework of SOM-KPCA-SVM patent quality classification system.


Introduction

In recent years, more and more investment companies have been eager to know which patents to invest in for future products, since hot product technology plays an important role in making a profit. However, the number of patents filed every year is increasing rapidly. The patents examined and approved in patent systems therefore play a crucial role in industry across different countries, and research on patent quality is receiving more attention from academic researchers and industrial practitioners. Patent quality analysis provides a convenient way for companies to determine whether to modify and continue manufacturing innovative products, but the question of how to evaluate and predict the quality or value of a new patent presents a new challenge to researchers and industrial practitioners. Moreover, the current technology for patent quality evaluation and classification is still at a primitive stage, and patent quality is still evaluated manually by experts, which leads to inaccuracies of all types.

Currently, organizations use various tools for analyzing patents, and an important issue within patent analysis is patent quality analysis. High-quality patent information can ensure the success of a business decision-making process or of product development [1], [2]. This study reviewed the patent analysis approaches that can reveal patent status such as patent quality, novelty, litigation, trends and so on [3]. However, traditional patent analysis requires much time, cost and manpower. An approach for identifying potentially high-quality patents therefore needs to shorten the analysis time. In general, the analysis approaches are statistical analysis or indicator computation. Recently, clustering methods have been widely applied to cluster patents according to patent characteristics in order to study patent trends [4]. Methods based on statistical analysis can help analysts understand the current patent situation or trend, but they do not provide effective rules or solutions for determining the potential quality of a newly applied patent. Evaluating a patent's future value is a key issue when a new patent is applied for or published, because patents have had an impact on industry throughout past industrial development, for example through patent litigation, particularly in the high-tech and information industries.

Patent officers approve a large number of patents each year, and current patent systems face the serious problem of evaluating the quality of these patents. Researchers and analysts have traditionally focused on developing various patent quality indicators. The patent indicators are collected from patent corpora and include the number of patent citations and the number of International Patent Classifications (IPC). The primary patent quality indicators [5], [6], [7] are related to investment, maintenance, and litigation, which form a basis for assessing a patent. However, these indicators have no predictive power for a new patent application or publication. Therefore, data mining (DM) approaches are employed in this article to identify and classify the quality of a new patent. Investors from venture capital companies can then obtain patent quality assessments in time when making decisions about the development of new innovative products and when discovering trends in state-of-the-art technology. Thus, an automatic classification system that analyzes and forecasts patent quality is needed in order to respond quickly to important or urgent situations.

In this study, we propose an automatic patent quality classification system named SOM-KPCA-SVM, which combines three DM methods: self-organizing maps (SOM), kernel principal component analysis (KPCA) and support vector machine (SVM). The SOM is a two-dimensional (or multidimensional) network structure that maps multiple variables and clusters samples into several groups [8]. In this article, SOM is therefore used to cluster patents into several quality groups according to patent quality indicators. We summarize these quality indicators in order to define the quality of each group. The result of the quality analysis on patent data is then extended to a classification problem, so that valuable patents can be identified early and patent quality can be forecast. KPCA combines kernel mapping with principal component analysis (PCA). It transforms the original feature space into a new nonlinear feature space through a nonlinear kernel mapping, so that the new features are independent variables [9]. Thus, we apply KPCA to extract the key characteristics of a patent from the patent document. For classification, the SVM approach is a powerful tool for solving many kinds of problems, such as stock trends [10] and patent classification [11]. In this article, the SVM is used to build the patent quality classification model, which determines patent quality automatically, so there is no need to hire an expert to rank or define patent quality. A SOM-KPCA-SVM system can therefore automatically analyze patent quality based on past patent applications and forecast the quality of an unknown patent, which better enables engineers and product designers to forecast a patent's potential for product development or innovation.
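The clustering role of the SOM described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `train_som` and `best_matching_unit`, the toy indicator values, the 1 x 2 grid, the linear decay schedule, and the deterministic initialisation are all assumptions chosen to keep the example small and reproducible. Each sample stands for one patent described by two normalised quality indicators, and the map unit a patent falls on is read as its quality group.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid cell whose weight vector is closest to x (squared Euclidean)."""
    return min(
        weights,
        key=lambda cell: sum((w - v) ** 2 for w, v in zip(weights[cell], x)),
    )


def train_som(samples, rows, cols, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(samples[0])
    lo = [min(s[i] for s in samples) for i in range(dim)]
    hi = [max(s[i] for s in samples) for i in range(dim)]
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    steps = max(len(cells) - 1, 1)
    # Assumption: units start evenly spread between the per-feature min and max.
    weights = {
        cell: [lo[i] + (hi[i] - lo[i]) * k / steps for i in range(dim)]
        for k, cell in enumerate(cells)
    }
    order = list(samples)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs          # linear decay (illustrative choice)
        lr = lr0 * decay                      # learning rate shrinks over time
        sigma = max(sigma0 * decay, 0.01)     # neighbourhood radius shrinks too
        rng.shuffle(order)
        for x in order:
            bmu = best_matching_unit(weights, x)
            for cell, w in weights.items():
                grid_d2 = (cell[0] - bmu[0]) ** 2 + (cell[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Hypothetical, normalised indicators per patent: [citation score, IPC score].
indicators = [
    [0.10, 0.05], [0.05, 0.10], [0.12, 0.08],   # weakly cited patents
    [0.90, 0.95], [0.95, 0.85], [0.88, 0.92],   # strongly cited patents
]
som = train_som(indicators, rows=1, cols=2)
groups = [best_matching_unit(som, x) for x in indicators]
print(groups)
```

With real data, the indicator averages of the patents on each unit would then be compared to decide which group counts as higher or lower quality, which is the step the article describes as defining the quality of each group.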

Inventors, attorneys, examiners, governments and companies need to reach a consensus on the quality of patents. Although the value can be estimated manually or by experts studying actual quality decisions, this is slow and expensive. In this study, we introduce an automatic patent quality analysis and classification system named SOM-KPCA-SVM, which predicts the quality class to which an application belongs. The main contributions of this work are summarized as follows:

  • The quality type is identified based on the clustering approach by self-organizing maps.

  • New feature sets of nonlinear space are transformed by KPCA for improving the quality classification of patent applications.

  • A quality classification system for classifying patent quality is obtained by using a support vector machine.

  • The classification effectiveness of the model is shown by an evaluation using more than 18,000 applications from around the world related to thin film solar cells.

  • A new patent analysis system allows users to automatically analyze the quality of patent applications.

  • A group of intellectual property experts who work with solar cell high-tech companies obtained analysis results by classifying patent applications with the quality classification system.
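The KPCA and SVM contributions listed above can be sketched with scikit-learn. This is a hedged illustration, not the system evaluated in the article: the synthetic `make_circles` data stands in for patent features, its two classes stand in for SOM-derived quality groups, and the RBF kernel, `gamma=10`, `n_components=2`, and the 75/25 split are assumptions made only for this example.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Stand-in data: two nonlinearly separable classes playing the role of
# two quality groups (assumption; the article uses patent documents).
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# KPCA maps the features through a nonlinear kernel and keeps the leading
# components; the SVM is then trained on those components.
model = make_pipeline(
    KernelPCA(n_components=2, kernel="rbf", gamma=10),
    SVC(kernel="rbf", C=1.0),
)
model.fit(X_train, y_train)

predictions = model.predict(X_test)      # predicted quality group per sample
accuracy = model.score(X_test, y_test)   # fraction of correct predictions
print(f"held-out accuracy: {accuracy:.2f}")
```

Wrapping both steps in one pipeline means the kernel mapping is fitted on the training data only and reused unchanged for new samples, which mirrors the article's goal of classifying a newly published patent with a model built from past patents.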

The remainder of this article is organized as follows. A literature review is given in Section 2. The proposed patent quality analysis and classification system based on SOM, KPCA and SVM is introduced in Section 3. The applications, results, and discussion of the proposed system are presented in Section 4. Conclusions and future research directions are presented in Section 5.

Section snippets

Literature reviews

In this section, we first summarize the studies related to patent analysis. Second, we focus on the patent quality indicators that are used in this study for calculating patent quality. Third, the literature on SOM is reviewed for patent clustering analysis. Fourth, the concepts of KPCA feature extraction are reviewed for capturing key features. Finally, we review the literature on SVM for developing the patent classification system.

Proposed method

This study proposes an automatic patent quality classification system, namely SOM-KPCA-SVM, that integrates three approaches: self-organizing maps, kernel principal component analysis and support vector machine. The system is implemented in two stages: stage one is patent analysis and quality definition, and stage two is building the patent quality classification model, as shown in Fig. 1. In stage one, we collect the patent data related specific

Experimental results

In this section, we design a series of tests to evaluate our proposed SOM-KPCA-SVM methodology. The experiments vary three parameters. First, the time scale of the data has three dataset periods: five years, ten years and forty years. Second, the number of quality groups has three settings: three, five and seven quality groups. Finally, the number of extracted features has four percentage settings, which are 25%, 50%,

Conclusions

This study proposed combining three data mining approaches for patent analysis and patent quality forecasting. The SOM-KPCA-SVM patent quality system combined self-organizing maps, kernel principal component analysis and support vector machine to classify the quality of thin film solar cell patents in the solar industry. The SOM successfully clustered patents into different quality groups, and the quality indicators differ statistically significantly between the quality groups. The KPCA has

References (35)
