Elsevier

Expert Systems with Applications

Volume 42, Issue 3, 15 February 2015, Pages 1479-1486

Graph kernel based measure for evaluating the influence of patents in a patent citation network

https://doi.org/10.1016/j.eswa.2014.08.051

Highlights

  • A new kernel based influence measure for evaluating patent influence is proposed.

  • We use the difference in kernel matrix norms as a measure of node influence.

  • The node with the largest difference in matrix norm is considered the most influential node.

  • The von Neumann kernel can be used to account for both direct and indirect citations.

  • Experiments show that our proposed approach performs better than existing measures.

Abstract

Identifying important patents helps to drive business growth and focus investment. In the past, centrality measures such as degree centrality and betweenness centrality have been applied to identify influential or important patents in patent citation networks. How a complex process such as technological change can be analyzed is an important research topic. However, no existing centrality measure leverages powerful graph kernels to this end. This paper presents a new centrality measure based on the change in the node similarity matrix computed with graph kernels. The proposed approach provides a more robust identification of influential nodes, since it focuses on graph structure information by considering both direct and indirect patent citations. This study begins with the premise that the change in the similarity matrix that results from removing a given node indicates the importance of that node within its network, since each node contributes to the similarity matrix among nodes. For a given node, we compute the singular values of the similarity matrix with the node present in the network and with the node removed, and take the change in the similarity matrix norm between the two cases. The node whose removal produces the largest change (i.e., decrease) in the similarity matrix norm is then considered the most influential node. We compare the performance of our proposed approach with other widely used centrality measures using artificial data and real-life U.S. patent data. Experimental results show that our proposed approach performs better than existing methods.

Introduction

A new patent typically refers to one or more previous patents in a bibliography. These citations highlight information that may be useful to the reader, explain how the current work relates to prior work, and indicate influences on the current work (Michel and Bettels, 2001; Newman, 2010). Patent citation data has long been known to be a source of information on technology innovation. Understanding technological evolution is vital for business and drives growth, and an increasing number of decision makers use patent citation analysis as a tool to survey and understand the activities of their competitors (Kim et al., 2014; Kim and Seol, 2012; Sood and Tellis, 2005).

Citing a patent implies that the contents of the cited patent are relevant to those of the citing patent in some way. Extending this idea, a patent citation network describes the relationships among a set of patents that cite each other, where patents are the nodes of the network and an edge exists between two nodes if one patent cites the other. Citation networks have the distinguishing characteristic of being acyclic, meaning that there are no closed loops of directed edges in the network (Newman, 2010). This characteristic results from all directed edges (citations) going from an older patent to a newer patent, and never in the other direction. This type of network differs from networks such as the World Wide Web (WWW) and social networks, in which cycles are common.
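The acyclic property described above can be checked mechanically. The following is a minimal Python sketch, not taken from the paper: the function name `is_acyclic`, the four-patent example, and the convention that `adj[i, j] = 1` means an edge from older patent `i` to newer patent `j` are all illustrative assumptions.

```python
import numpy as np


def is_acyclic(adj: np.ndarray) -> bool:
    """Return True if the directed graph has no directed cycle (Kahn's algorithm)."""
    n = adj.shape[0]
    in_degree = (adj != 0).sum(axis=0)  # column sums count incoming edges
    ready = [i for i in range(n) if in_degree[i] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in np.flatnonzero(adj[u]):
            in_degree[v] -= 1
            if in_degree[v] == 0:
                ready.append(int(v))
    # Every node can be removed exactly when no cycle exists.
    return removed == n


# Hypothetical example: patents indexed by filing order, 0 is the oldest.
citations = np.zeros((4, 4), dtype=int)
for older, newer in [(0, 1), (0, 2), (1, 3), (2, 3)]:
    citations[older, newer] = 1

with_loop = citations.copy()
with_loop[3, 0] = 1  # an impossible "newer to older" edge closes a loop

print(is_acyclic(citations))  # True
print(is_acyclic(with_loop))  # False
```

When patents are indexed chronologically, as in this example, the adjacency matrix is strictly upper triangular, which is another way to see that no closed loop can exist.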

In a general citation network, there are two kinds of nodes of particular interest: authorities and hubs. Authorities are nodes that contain especially useful information on a topic of interest. Hubs, such as review papers, are nodes that tell where the best authorities can be found (Kleinberg, 1999; Newman, 2010). Kleinberg proposed a hyperlink structural analysis algorithm to determine authorities and hubs in the World Wide Web (Kleinberg, 1999). Discovering authoritative sources in the WWW is similar to finding these important nodes in a patent citation network. Based only on the structural analysis of the patent citation network, we aim to score and rank nodes by importance in order to identify the most influential patents.
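To make the hub and authority idea concrete, here is a small Python sketch of the usual power-iteration form of Kleinberg's hubs-and-authorities scoring. It is an illustration only: the function name `hits`, the iteration count, the four-page example, and the convention that `links[i, j] = 1` means page `i` links to page `j` are assumptions, and the sketch assumes the graph has at least one edge.

```python
import numpy as np


def hits(links: np.ndarray, iterations: int = 50):
    """Return (hub, authority) score vectors via alternating power iteration."""
    n = links.shape[0]
    hubs = np.ones(n)
    authorities = np.ones(n)
    for _ in range(iterations):
        # A good authority is pointed to by good hubs.
        authorities = links.T @ hubs
        authorities /= np.linalg.norm(authorities)
        # A good hub points to good authorities.
        hubs = links @ authorities
        hubs /= np.linalg.norm(hubs)
    return hubs, authorities


# Hypothetical example: page 0 links to pages 1 and 2, page 3 links to page 1.
links = np.zeros((4, 4))
links[0, 1] = links[0, 2] = links[3, 1] = 1.0

hub_scores, authority_scores = hits(links)
print(int(np.argmax(hub_scores)))        # 0, the page pointing to the most authorities
print(int(np.argmax(authority_scores)))  # 1, the page pointed to by both hubs
```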

A great deal of research has been conducted with the goal of detecting influential or important nodes in a variety of networks (Borgatti, 2005; Borgatti, 2006; Freeman, 1978; Narayanam and Narahari, 2011; Opsahl et al., 2010; Rousseau, 1987). For this objective, a variety of importance measures, called centrality measures, have been developed. Degree centrality is defined as the number of edges incident to a node (Newman, 2010). For directed networks, degree centrality can be split into out-degree and in-degree centrality by counting the number of edges directed out from a node or into a node, respectively. Another centrality measure is closeness. In connected graphs there is a natural distance metric between all pairs of nodes, given by the length of the shortest path between them. The farness of a node i is defined as the sum of its shortest path distances to all other nodes, and its closeness is defined as the inverse of the farness. The random walk closeness centrality (Noh & Rieger, 2004) measures the speed with which a randomly walking message reaches a node from elsewhere in the network, thus resulting in a random-walk version of closeness centrality. Kwon et al. (2009) propose the weighted reachability (WR) measure, which is applied specifically to directed citation networks. The main idea of the WR measure is to consider both adjacent nodes (direct citations) and non-adjacent nodes (indirect citations). In this measure, direct citations are given a greater weight than indirect citations, and indirect citations are weighted in inverse proportion to the length of the path between two nodes. Most existing centrality measures do not consider non-adjacent nodes (indirect citations). Although some centrality measures, such as WR, do consider indirect citations, their weighting system is not very robust.
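The measures above can be illustrated on a toy network. The Python sketch below is not the implementation used in any of the cited works: all function names and the example graph are hypothetical, edges are assumed to run from the cited (older) patent to the citing (newer) patent, and the weighted reachability score is assumed to be a plain sum of 1/d over reachable nodes at distance d, which matches the verbal description but may differ from the exact formula of Kwon et al. (2009).

```python
from collections import deque

import numpy as np


def shortest_path_lengths(adj: np.ndarray, source: int) -> dict:
    """Breadth-first hop counts from source; unreachable nodes are omitted."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in np.flatnonzero(adj[u]):
            v = int(v)
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def out_degree_centrality(adj: np.ndarray) -> np.ndarray:
    """Number of edges leaving each node (row sums)."""
    return adj.sum(axis=1)


def closeness_centrality(adj: np.ndarray, i: int) -> float:
    """Inverse farness; assumes every other node is reachable from i."""
    dist = shortest_path_lengths(adj, i)
    farness = sum(d for node, d in dist.items() if node != i)
    return 1.0 / farness


def weighted_reachability(adj: np.ndarray, i: int) -> float:
    """Assumed form: direct citations count 1, a citation at distance d counts 1/d."""
    dist = shortest_path_lengths(adj, i)
    return sum(1.0 / d for node, d in dist.items() if node != i)


# Hypothetical network: patent 0 is cited by 1 and 2, which are both cited by 3.
citations = np.zeros((4, 4), dtype=int)
for cited, citing in [(0, 1), (0, 2), (1, 3), (2, 3)]:
    citations[cited, citing] = 1

undirected = citations + citations.T  # closeness needs a connected graph

print(out_degree_centrality(citations))     # direct citations received per patent
print(weighted_reachability(citations, 0))  # 2.5 = 1 + 1 + 1/2
print(closeness_centrality(undirected, 0))  # 0.25 = 1 / (1 + 1 + 2)
```

In this example, out-degree credits patent 0 with two citations, while the reachability score also gives it partial credit for the indirect citation from patent 3.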

This paper proposes a new centrality measure for directed patent citation networks with unweighted edges between pairs of nodes. The measure is new in two ways: (1) it applies various graph kernels, which have not yet been applied to patent citation analysis, and (2) it leverages the direction of citations. We quantify the importance of a node using the patent citation information, specifically the patent citation network structure. The idea is to weight the adjacency matrix and its higher orders (i.e., powers) so that we capture direct and indirect citations with a varying and flexible weighting scheme, providing a centrality measure that is more robust than any existing measure. This paper works on the assumption that the change in the similarity matrix that results from removing a particular node reflects the importance of that node to the network to which it belongs. We assume this relationship because each node, when included in the network, contributes to the similarity matrix of the network. Using matrix norms, which measure the size of a matrix based on its singular values, the proposed measure computes the change in the similarity matrix that results from removing a node. The largest change in the matrix norm identifies the most influential node in the network.
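The node-removal idea can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the paper's GKB-SVC procedure: the kernel is assumed to be the Neumann series K = αA + α²A² + ... = (I − αA)⁻¹ − I, the norm is assumed to be the nuclear norm (sum of singular values), and α = 0.5, the function names, and the star-shaped example are all hypothetical choices.

```python
import numpy as np


def neumann_kernel(adj: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Assumed kernel: sum over k >= 1 of alpha^k A^k, so longer paths weigh less.

    For an acyclic graph the series is finite; for general graphs alpha must be
    smaller than 1 / (spectral radius of A) for the series to converge.
    """
    identity = np.eye(adj.shape[0])
    return np.linalg.inv(identity - alpha * adj) - identity


def nuclear_norm(matrix: np.ndarray) -> float:
    """Sum of singular values (one possible singular-value-based norm)."""
    return float(np.linalg.svd(matrix, compute_uv=False).sum())


def remove_node(adj: np.ndarray, i: int) -> np.ndarray:
    """Delete row i and column i, i.e. the node and all of its edges."""
    return np.delete(np.delete(adj, i, axis=0), i, axis=1)


def influence_scores(adj: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Decrease in kernel norm caused by removing each node in turn."""
    full = nuclear_norm(neumann_kernel(adj, alpha))
    return np.array([
        full - nuclear_norm(neumann_kernel(remove_node(adj, i), alpha))
        for i in range(adj.shape[0])
    ])


# Hypothetical example: patent 0 is cited directly by patents 1, 2, and 3.
star = np.zeros((4, 4))
star[0, 1] = star[0, 2] = star[0, 3] = 1.0

scores = influence_scores(star)
print(scores)
print(int(np.argmax(scores)))  # 0, removing the cited patent shrinks the norm most
```

One limitation of this simplified version is that singular values are unchanged by transposing the matrix, so reversing every edge gives the same scores; the sketch therefore does not by itself capture the use of citation direction mentioned above.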

Our proposed centrality measure considers both adjacent nodes and nodes that are reachable but not adjacent, as opposed to many other centrality measures that consider only adjacent nodes. Furthermore, our procedure allows for robust scoring and ranking of nodes by importance in order to identify influential nodes of a directed citation network, so that the key technology areas are clearly identified. To evaluate the quality of the ranking produced by the proposed centrality measure applied to patent citation networks, we compare our results to out-degree centrality and the original singular values-based centrality (SVC) (Kim et al., 2012) using artificial patent citation data and real-life patent citation data. Ultimately, we show an improvement over the original SVC.

The remainder of the paper is organized as follows. Section 2 gives an overview of various graph kernels and matrix norms. Section 3 presents the details of the proposed centrality measure, which leverages graph kernels. Section 4 presents the computational results obtained using artificial patent citation networks. Section 5 provides a case study on a real-life patent citation network dataset. Finally, Section 6 presents the conclusions drawn from the experimental results.

Section snippets

Background

In this section we provide some background on graph kernels and matrix norms.

Introduction of GKB-SVC

The motivation of graph kernel-based singular values-based centrality (GKB-SVC) is to build on and improve the singular values-based centrality (SVC) presented in Kim et al. (2012). In doing so, we generalize the centrality measure by allowing for weighting of indirect citations of different lengths, thus making a more robust measure in order to better identify influential patents (i.e., the core technologies) using the graph similarity matrix to explain the relationship of the nodes (patents),

Experimental results

In this section we compare our proposed GKB-SVC measure with existing centrality measures using artificial datasets. The centrality measures considered for comparison purposes are out-degree centrality, original SVC, and GKB-SVC.

Case study

In this section, the relative performance of the existing centrality measures and the proposed methods are compared using the Coefficient of Variation (CV), which is a method to evaluate the discrimination ability of the centrality measures (Kim et al., 2012). The centrality measures considered for comparison in this case study are out-degree centrality, original SVC, and the graph kernel based-SVC proposed in this paper.
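As a reminder of how the Coefficient of Variation is computed, the following Python sketch uses the standard definition, the standard deviation divided by the mean. The function name, the use of the population standard deviation, and the score vectors are assumptions made for illustration; they are not results from the paper, and the sketch assumes the mean score is nonzero.

```python
import numpy as np


def coefficient_of_variation(scores) -> float:
    """Standard deviation divided by mean (population standard deviation assumed)."""
    values = np.asarray(scores, dtype=float)
    return float(values.std() / values.mean())


# Hypothetical centrality scores for the same four nodes under two measures.
spread_out = [2, 1, 1, 0]  # nodes receive clearly different scores
all_tied = [1, 1, 1, 1]    # every node receives the same score

print(coefficient_of_variation(spread_out))  # about 0.707
print(coefficient_of_variation(all_tied))    # 0.0
```

On the reading that a larger spread relative to the mean separates nodes more clearly, a measure that ties every node has a CV of zero and offers no discrimination at all.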

The original pool of patents used for the computational experiments

Conclusion

In this paper, we presented a graph kernel-based method for ranking patents by influence given a patent citation network. Specifically, we proposed to use the von Neumann graph kernel to weight both the direct and the indirect citations that a patent receives from later patents, in order to evaluate patent influence. The presented methods were specifically developed to be applied to patent citation networks, but may also be applied to literature citation networks, where there is a natural sense

References (19)

  • Borgatti, S.P. (2005). Centrality and network flow. Social Networks.
  • Freeman, L.C. (1978). Centrality in social networks: Conceptual clarification. Social Networks.
  • Opsahl, T., et al. (2010). Node centrality in weighted networks: Generalizing degree and shortest paths. Social Networks.
  • Borgatti, S.P. (2006). Identifying sets of key players in a social network. Computational and Mathematical Organization Theory.
  • Brand, M. (2003). Fast online svd revisions for lightweight recommender systems. In SIAM third international conference...
  • Fouss, F., Yen, L., Pirotte, A., & Saerens, M. (2006). An experimental investigation of graph kernels on a...
  • Kandola, J., et al. Learning semantic similarity.
  • Kim, B., et al. (2014). Inter-cluster connectivity analysis for technology opportunity discovery. Scientometrics.
  • Kim, C., et al. On a patent analysis method for identifying core technologies.

There are more references available in the full text version of this article.

Cited by (22)

  • A synthetical analysis method of measuring technology convergence

    2022, Expert Systems with Applications
    Citation Excerpt :

    Ko, Yoon, and Seo (2014) constructed a technological knowledge flow matrix based on patent citation analysis to describe knowledge flow information and the interdisciplinary evolution trends of technologies. Rodriguez, Kim, Lee, Coh, and Jeong (2015) described a node-similarity matrix change method based on patent citation networks to analyze trends in technological change and to identify core patents. Lee, Kim, Cho, and Park (2009) constructed a technology network based on patent citation data that takes into account direct and indirect effects between technologies to identify core technologies.

  • Graph convolutional networks for enhanced resolution 3D Electrical Capacitance Tomography image reconstruction

    2021, Applied Soft Computing
    Citation Excerpt :

    However, there exist numerous data that lay on irregular or non-Euclidean domains. Examples are pairwise relationships in various networks, including citation networks [37,38], social networks [39] or transportation networks [40]. Irregular graphs may also encode geometric data and structure of genes, proteins [41], chemical compounds [42] or complex 3D shapes [43,44].

  • Identifying influential energy stocks based on spillover network

    2020, International Review of Financial Analysis
    Citation Excerpt :

    Many indexes have been proposed to elucidate the special locations of certain nodes that could be used as indicators of the influences of nodes (Chen, Lü, Shang, Zhang, & Zhou, 2012; Lawyer, 2015). These indexes have been successfully used to identify influential opinion leaders in social networks (Zhao, Li, & Jin, 2016), influential amino acids in protein networks (Bulashevska, Bulashevska, & Eils, 2010) and influential scientists in the citation network (Rodriguez, Kim, Lee, Coh, & Jeong, 2015), among other applications. In our research, we build a spillover network to capture the asymmetric spillover effects among stocks and to provide a system description of the intricate spillover correlations among stocks.
