A patent quality analysis and classification system using self-organizing maps with support vector machine
Graphical abstract
The framework of the SOM-KPCA-SVM patent quality classification system.
Introduction
In recent years, a growing number of investment companies have been eager to know which patents to invest in for future products, since popular product technologies play an important role in generating profit. However, the number of patents filed every year is increasing rapidly. The patents examined and approved in patent systems therefore play a crucial role in industry across different countries, and research on patent quality is receiving more attention from academic researchers and industrial practitioners. Patent quality analysis gives companies an expedient way to determine whether to modify and continue manufacturing innovative products, but the question of how to evaluate and predict the quality or value of a new patent presents a new challenge to researchers and industrial practitioners. Moreover, current technology for patent quality evaluation and classification is still at a primitive stage, and patent quality is still evaluated manually by experts, which leads to inaccuracies of many kinds.
Organizations currently use various tools to analyze patents, and patent quality analysis is an important issue within patent analysis. High-quality patent information can support successful business decision-making and product development [1], [2]. Patent analysis approaches that help in understanding patent status, such as patent quality, novelty, litigation and trends, have been reviewed in [3]. However, traditional patent analysis requires much time, cost and manpower, so an approach for identifying potentially high-quality patents needs to shorten the analysis time. In general, the analysis approaches rely on statistical analysis or the computation of indicators. Recently, clustering methods have been widely applied to group patents according to their characteristics in order to reveal patent trends [4]. Statistical methods can help analysts understand the current patent situation or trend, but they do not provide effective rules or solutions for determining the potential quality of a newly filed patent. Evaluating the future quality of a patent is a key issue when a new patent is filed or published, because past industrial development shows that patents affect industry, for example through patent litigation, particularly in the high-tech and information sectors.
Patent offices approve a large number of patents each year, and current patent systems face the serious problem of evaluating the quality of these patents. Researchers and analysts have traditionally focused on developing various patent quality indicators. These indicators are collected from patent corpora and include the number of patent citations and the number of International Patent Classification (IPC) classes. The primary patent quality indicators [5], [6], [7] relate to investment, maintenance and litigation, and they form a basis for assessing patents. However, these indicators have no predictive power for a new patent application or publication. Therefore, data mining (DM) approaches are employed in this article to identify and classify the quality of new patents. Investors from venture capital companies can then obtain patent quality assessments in time to make decisions about developing new innovative products and to discover trends in state-of-the-art technology. Thus, an automatic classification system that analyzes and forecasts patent quality is needed in order to respond quickly to important or urgent situations.
In this study, we propose an automatic patent quality classification system named SOM-KPCA-SVM, which combines three DM methods: self-organizing maps (SOM), kernel principal component analysis (KPCA) and support vector machine (SVM). The SOM is a two-dimensional (or multidimensional) network structure that maps multiple variables and clusters samples into several groups [8]. In this article, SOM is therefore used to cluster patents into several quality groups according to patent quality indicators. We summarize these quality indicators in order to characterize the quality level of each group. The result of the quality analysis on patent data is then extended to a classification problem, so that valuable patents can be identified early and patent quality can be forecast. KPCA combines kernel mapping with principal component analysis (PCA). It transforms the original feature space into a new nonlinear feature space through a nonlinear kernel mapping, in which the new features are independent variables [9]. Thus, we apply KPCA to extract the key characteristics of a patent from the patent document. For classification problems, SVM is a powerful tool that has been applied to many kinds of problems, such as stock trends [10] and patent classification [11]. In this article, SVM is used to build the patent quality classification model, which determines patent quality automatically, so there is no need to hire an expert to rank or define patent quality. The SOM-KPCA-SVM system can therefore automatically analyze patent quality based on past patent applications and forecast the quality of an unknown patent, which better enables engineers and product designers to forecast a patent's potential for product development or innovation.
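The clustering step described above can be illustrated with a minimal sketch, assuming a tiny rectangular SOM written with the Python standard library only. The grid size, learning-rate schedule, neighbourhood function and the two-dimensional indicator vectors are all hypothetical choices made for illustration; this excerpt does not state the configuration used in the study.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position of the node whose weights are closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - xi) ** 2 for w, xi in zip(weights[node], x)),
    )


def train_som(samples, rows, cols, iterations=500, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    # Initialise every node from a randomly chosen sample (copied, not shared).
    weights = {
        (r, c): list(rng.choice(samples)) for r in range(rows) for c in range(cols)
    }
    for t in range(iterations):
        frac = t / iterations
        lr = lr0 * (1.0 - frac)                  # learning rate decays to zero
        sigma = max(sigma0 * (1.0 - frac), 0.1)  # neighbourhood radius shrinks
        x = rng.choice(samples)
        bmu = best_matching_unit(weights, x)
        for node, w in weights.items():
            grid_dist2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
            h = math.exp(-grid_dist2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
            for i, xi in enumerate(x):
                w[i] += lr * h * (xi - w[i])
    return weights


# Hypothetical, already-normalised indicator vectors:
# [citation count, number of IPC classes], each scaled to [0, 1].
low_quality = [[0.10, 0.05], [0.05, 0.10], [0.12, 0.08]]
high_quality = [[0.90, 0.95], [0.95, 0.85], [0.88, 0.92]]
patents = low_quality + high_quality

som = train_som(patents, rows=1, cols=2)
groups = [best_matching_unit(som, p) for p in patents]
print(groups)  # one grid position (quality group) per patent
```

Each patent is assigned to the node that best matches its indicator vector, and the node index acts as the quality group label. With more nodes, neighbouring nodes on the grid would hold similar indicator profiles, which is what makes the map useful for summarizing the groups.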
Inventors, attorneys, examiners, governments and companies need to reach a consensus on the quality of patents. Although the value can be estimated manually or by experts studying actual quality decisions, this is slow and expensive. In this study, we introduce an automatic patent quality analysis and classification system named SOM-KPCA-SVM, which predicts the quality class into which an application will be classified. The main contributions of this work are summarized as follows:
- The quality type is identified based on the clustering approach by self-organizing maps.
- New feature sets of nonlinear space are transformed by KPCA for improving the quality classification of patent applications.
- Quality classification system to classify the patent quality is obtained by using a support vector machine.
- The classification effectiveness of the model is shown by an evaluation using more than 18,000 applications around the world related to thin film solar cell.
- A new patent analysis system allows users to automatically analyze the quality of patent applications.
- A group of intellectual property experts who work with solar cell hi-tech companies obtained analysis results by classifying patent applications on the quality classification system.
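The KPCA feature transformation named in the contributions above can be sketched as follows, assuming an RBF kernel and a hypothetical `gamma` value; the kernel and its parameters used in the study are not given in this excerpt. The sketch uses only the standard library, with power iteration standing in for the eigen-solver that a numerical library would normally provide, and the feature vectors are made up for illustration.

```python
import math
import random


def rbf_kernel_matrix(X, gamma):
    """K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj))) for xj in X]
        for xi in X
    ]


def center_kernel(K):
    """Centre the (symmetric) kernel matrix in the implicit feature space."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    total_mean = sum(row_mean) / n
    return [
        [K[i][j] - row_mean[i] - row_mean[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def top_eigenpairs(A, k, iterations=200, seed=0):
    """Leading k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    rng = random.Random(seed)
    n = len(A)
    A = [row[:] for row in A]  # work on a copy so the caller's matrix is untouched
    pairs = []
    for _component in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _step in range(iterations):
            w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # remaining spectrum is numerically zero
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        lam = sum(v[i] * sum(A[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        for i in range(n):  # deflate: remove the component just found
            for j in range(n):
                A[i][j] -= lam * v[i] * v[j]
    return pairs


def kernel_pca(X, n_components, gamma):
    """Project the training points onto the leading kernel principal components."""
    Kc = center_kernel(rbf_kernel_matrix(X, gamma))
    pairs = top_eigenpairs(Kc, n_components)
    # The projection of training point i on a component is sqrt(lambda) * v[i].
    return [
        [math.sqrt(max(lam, 0.0)) * v[i] for lam, v in pairs]
        for i in range(len(X))
    ]


# Hypothetical numeric patent features, scaled to [0, 1].
X = [[0.10, 0.05], [0.05, 0.10], [0.12, 0.08],
     [0.90, 0.95], [0.95, 0.85], [0.88, 0.92]]
Z = kernel_pca(X, n_components=2, gamma=1.0)
for z in Z:
    print([round(value, 3) for value in z])
```

The rows of `Z` are the transformed feature vectors that a downstream classifier such as an SVM would be trained on. Keeping fewer components than original features corresponds to extracting only a fraction of the features.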
The overall structure of this article is as follows. A literature review is given in Section 2. The proposed patent quality analysis and classification system based on SOM, KPCA and SVM is introduced in Section 3. The applications, results and discussion of the proposed system are presented in Section 4. Finally, conclusions and future research directions are presented in Section 5.
Section snippets
Literature reviews
In this section, we first summarize the studies related to patent analysis. Second, we focus on the patent quality indicators that are used in this study to calculate patent quality. Third, the literature on SOM is reviewed for patent clustering analysis. Fourth, the concepts of KPCA feature extraction are reviewed for capturing key features. Finally, we review the literature on SVM for developing the patent classification system.
Proposed method
This study proposes an automatic patent quality classification system, SOM-KPCA-SVM, that integrates three approaches: self-organizing maps, kernel principal component analysis and support vector machine. The system is implemented in two stages: stage one is patent analysis and quality definition, and stage two is building the patent quality classification model, as shown in Fig. 1. In stage one, we collect the patent data related specific
Experimental results
In this section, we design a series of tests to evaluate the proposed SOM-KPCA-SVM methodology. The experiments vary three parameters. First, the time scale of the data has three settings: datasets covering five years, ten years and forty years. Second, the number of quality groups has three settings: three, five and seven quality groups. Finally, the proportion of extracted features has four settings: 25%, 50%,
Conclusions
This study applied three data mining approaches to patent analysis and patent quality forecasting. The SOM-KPCA-SVM patent quality system combined self-organizing maps, kernel principal component analysis and support vector machine to classify the quality of thin film solar cell patents in the solar industry. The SOM successfully clustered patents into different quality groups, and the quality indicators differed with statistical significance between the groups. The KPCA has
References (35)
- et al. A patent quality analysis for innovative technology and product development. Adv. Eng. Informatics (2012)
- et al. Intelligent patent recommendation system for innovative design collaboration. J. Network Comp. Appl. (2013)
- et al. A literature review on the state-of-the-art in patent analysis. World Patent Info. (2014)
- et al. Classifying technology patents to identify trends: Applying a fuzzy-based clustering approach in the Turkish textile industry. Technol. Soc. (2009)
- et al. The increasing linkage between US technology and public science. Res. Policy (1997)
- et al. Patent analysis-based fuzzy inference system for technological strategy planning. Automat. Constr. (2009)
- et al. Towards content-oriented patent document processing: Intelligent patent analysis and summarization. World Patent Info. (2015)
- et al. The patent portfolio value analysis: A new framework to leverage patent information for strategic technology planning. Technol. Forecast. Soc. Change (2015)
- The value of U.S. patents by owner and patent characteristics. Res. Policy (2008)
- Prior art search tools on the Internet and legal status of the results: a European Patent Office perspective. World Patent Info. (2004)
- Of submarines and interference: legal status changes following citation of an earlier US patent or patent application under 35 USC §102 (e). World Patent Info.
- Valuation effects of patent quality: A comparison for Japanese and U.S. firms. Pacific-Basin Fin. J.
- Patent statistics: A good indicator for innovation in China? Patent subsidy program impacts on patent quality. China Econ. Rev.
- The relationship between a firm's patent quality and its market value: The case of US pharmaceutical industry. Technol. Forecast. Soc. Change
- Patent indicators for macroeconomic growth: the value of patents estimated by export volume. Technovation
- Citations, family size, opposition and the value of patent rights. Res. Policy
- On-line pattern analysis by evolving self-organizing maps. Neurocomputing