Elsevier

Journal of Informetrics

Volume 12, Issue 4, November 2018, Pages 1223-1231
Regular article
Patent citation spectroscopy (PCS): Online retrieval of landmark patents based on an algorithmic approach

https://doi.org/10.1016/j.joi.2018.10.002

Highlights

  • We provide the first demonstration that reference citation spectroscopy techniques can be successfully applied to discover landmark patents.

  • Enhanced Patent Citation Spectroscopy (PCS) methodology enables more targeted identification of influential prior art.

  • Introduction of interactive web-application to retrieve and implement PCS on patent references.

Abstract

One essential component in the construction of patent landscapes in biomedical research and development (R&D) is identifying the most seminal patents. Hitherto, the identification of seminal patents required subject matter experts within biomedical areas. In this article, we report an analytical method and tool, Patent Citation Spectroscopy (PCS), for the online identification of landmark patents in user-specified areas of biomedical innovation. Using USPTO data, PCS mines the cited references within large sets of patents on the internet and provides an estimate of the historically most impactful prior work. We show the efficacy of PCS in three case studies of biomedical innovation with clinical relevance: (1) RNA interference (RNAi), (2) cholesterol and (3) cloning. First, PCS mined and analyzed cited references related to patents on RNA interference and correctly identified the foundational patent of this technology, as independently reported by subject matter experts on RNAi intellectual property. Second, we apply PCS to a broad set of patents dealing with cholesterol, a case study chosen to reflect a more general, as opposed to expert, patent search query. PCS mined through cited references and identified the seminal patent as that for Lipitor, the groundbreaking medication for treating high cholesterol, as well as the pair of patents underlying Repatha. The final case study, cloning, highlights some of the advantages conferred by the PCS methodology in identifying seminal patents. These cases suggest that PCS provides a useful method for identifying seminal patents in areas of biomedical innovation and therapeutics. The interactive tool is free to use at: http://www.leydesdorff.net/comins/pcs/index.html.

Introduction

Amongst the various components of a patent landscape, identifying seminal patents in an innovation area requires substantial investment from specialists (e.g., Schmidt, 2007). Hitherto, subject matter experts have reviewed a large corpus of patents and patent applications within their historical context to render a judgment of the most technologically important patents. This method is time-consuming, difficult to replicate, and predicated on the availability of subject matter experts (Cockburn, Kortum, & Stern, 2002); and yet it remains a requirement for patent examiners and historians of science. Thus, there is a need for computer-assisted methods for uncovering insights about landmark patents in technology areas (Jensen & Murray, 2005; Konski & Spielthenner, 2009).

Clinical advances depend upon a sound understanding of biomedical research and development. The importance of maintaining situational awareness of biomedical R&D activities for businesses and policy-makers is best exemplified by the proliferation of patent landscapes produced by subject matter experts covering a wide range of topics (e.g., CRISPR: Egelie, Graff, Strand, & Johansen, 2016; Induced Pluripotent Stem Cells: Roberts et al., 2014; Bergman & Graff, 2007; Prenatal Testing: Agarwal, Sayres, Cho, Cook‐Deegan, & Chandrasekharan, 2013; Carbon Nanotubes: Harris & Bawa, 2007; Nanomedicine: Wagner, Dullaart, Bock, & Zweck, 2006; Gene Sequences: Jensen & Murray, 2005). Given the enormous growth in the number of annual patent applications filed (United States Patent & Trademark Office - Patent Technology Monitoring Team, 2016), particularly in the life and biomedical sciences (Moses et al., 2015; cf. Agarwal & Searls, 2009), there is increasing demand for patent landscapes across a panoply of technologies (e.g., Breitzman & Thomas, 2015; Jaffe & Trajtenberg, 2002).

In the final section of their comprehensive review of the literature about patent citations, Sharma and Tripathi (2017) conclude that citation analysis among patents is able to retrieve the patents and publications which play a vital role in the growth of a technology. Because of the requirement of citing state-of-the-art literature and the possibility for the examiner to add further citations to an application (Alcácer, Gittelman, & Sampat, 2009), the citation field is controlled more rigorously in patenting than in scholarly publishing, where citation traditions vary with disciplinary backgrounds (de Solla Price, 1970). Patentometry has accordingly become a flourishing field during the last decades, in relation also to the possibility of retrieving non-patent literature references (Narin, Hamilton, & Olivastro, 1997) and thus of relating the different knowledge flows and sources. Sharma and Tripathi (2017, p. 40) also found 23 software patents filed at USPTO dealing with valuing patents economically through citation analysis.

The integration of historiography with network analysis of citation patterns is developing in parallel among both patents and scholarly publications (Leydesdorff, Bornmann, Comins, Marx, & Thor, 2016; Liu & Lu, 2012). Abbas, Zhang, and Khan (2014) provide a review of the computational strategies and software relevant for data mining patents. They note that visualization techniques play an important role because these can stimulate the use of patent information in business and other organizations. Tools should be developed that offer multiple suggestions for the reconstruction and for devising strategies (Rotolo, Rafols, Hopkins, & Leydesdorff, 2017).

In light of this growing need, we introduce an algorithmic approach and corresponding web-application for identifying landmark patents, a key component of patent landscapes, across user-specified biomedical areas. Our approach is data-oriented and historiographic, and hence based on descriptive statistics (Anderson, 2008). Different from case-study-based models of patent structures (e.g., Ma & Porter, 2015; Chang, Wu, & Leu, 2010), however, our method is generic: it can be used with any retrieval from USPTO based on keywords or more advanced search parameters (e.g., CPC subclasses; Leydesdorff, Alkemade, Heimeriks, & Hoekstra, 2015; in addition, for an application of PCS to photovoltaic patents, see Comins & Leydesdorff, 2018). Unlike model-based approaches assuming, for example, evolutionary mechanisms (e.g., Breitzman & Thomas, 2015; Valverde, 2014), we access the data each round without theoretical assumptions other than retrieval and visualization optimizations that are based on the literature about Referenced Publication Year Spectroscopy (RPYS) developed for similar purposes using scientific literature (Marx & Bornmann, 2014; Thor, Marx, Leydesdorff, & Bornmann, 2016). Like RPYS, PCS assumes an accumulation of citations in the case of important discoveries and contributions (Kaplan, 1965). Patent Citation Spectroscopy (PCS) operates over the cited references of large sets of patents to determine the seminal prior works within a given field; we also provide an openly available web-application for performing PCS.
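The idea that important contributions accumulate citations within a set of patents can be illustrated with a minimal sketch. The code below is not the authors' implementation: the function names, the input layout (a mapping from each citing patent to its list of cited patents) and the identifiers in the toy data are all hypothetical, and the sketch only tallies how often each referenced patent is cited across the set.

```python
from collections import Counter


def tally_cited_patents(citing_to_cited):
    """Count how many patents in the set cite each referenced patent.

    `citing_to_cited` is an assumed input layout: a dict mapping a citing
    patent identifier to the list of patent identifiers it cites.
    """
    counts = Counter()
    for cited in citing_to_cited.values():
        # Collapse duplicates so each citing patent contributes at most once.
        counts.update(set(cited))
    return counts


def top_candidates(citing_to_cited, n=3):
    """Return the n most frequently cited patents as (identifier, count) pairs."""
    counts = tally_cited_patents(citing_to_cited)
    # Sort by descending count, then by identifier for deterministic ties.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Toy data with hypothetical identifiers (not real patent numbers).
toy = {
    "P-A": ["REF-1", "REF-2"],
    "P-B": ["REF-1", "REF-3"],
    "P-C": ["REF-1", "REF-2", "REF-2"],
}

print(top_candidates(toy, n=2))  # [('REF-1', 3), ('REF-2', 2)]
```

In this toy set, the reference cited by all three patents surfaces first, which is the kind of accumulation a citation-based ranking would flag for closer inspection.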

In this contribution, we apply PCS to three areas of biomedical innovation: (1) RNA interference (RNAi), (2) cholesterol and (3) cloning. RNAi was selected to examine the efficacy of PCS for understanding the origins of an emerging technology with well-documented expert reviews to ground our findings (Schmidt, 2007). Cholesterol was selected to consider how PCS performs for searches on broad areas of biomedical innovation and clinical relevance that reflect the less sophisticated kind of searches conducted by users who are not library and information scientists or patent experts. Finally, our third case study was selected to reveal the advantages conferred by using the PCS algorithm to identify a seminal patent.

Methods

PCS can be performed over any set of US patent data that includes a list of referenced patents, or backward citations. Our routines utilize data from PatentsView, a data platform sponsored by the USPTO. PatentsView provides backward citation information for all US patents from 1976 through July 2016 via an Application Programming Interface (API). We leverage this API both in demonstrating the utility of PCS to identify seminal patents and in creating a tool that makes PCS available for

Results

To demonstrate the utility of this tool, we applied PCS to an area of biomedical innovation: RNA interference. We selected RNA interference (hereafter RNAi) as a use case for two reasons: (1) RNAi represents a burgeoning domain of biomedical innovation with potentially therapeutic applications for the treatment of viral infections and cancer; and (2) the patent landscape of RNAi has been studied by subject matter experts, which allows us to compare the results of PCS with their conclusions.

In

Discussion

Identifying intellectual pathways within biomedical science and technology is an important component of patent landscapes required by businesses and policy-makers. The possibility of online identification of landmark patents by PCS supports the generation of data-driven patent landscapes. Using the PCS methodology and application described here, it may be easier for users to understand the fundamental patents of myriad biotechnologies as well as the companies (assignees) and people (inventors)

Author contributions

Jordan A. Comins: Conceived and designed the analysis; Collected the data; Contributed data or analysis tool; Performed the analysis; Wrote the paper.

Stephanie A. Carmack: Conceived and designed the analysis.

Loet Leydesdorff: Conceived and designed the analysis; Wrote the paper.

Acknowledgment

We thank three anonymous referees for comments.

References (48)

  • A. Agarwal et al.

    Commercial landscape of noninvasive prenatal testing in the United States

    Prenatal Diagnosis

    (2013)
  • C. Anderson

    The end of theory: The data deluge makes the scientific method obsolete

    Wired magazine

    (2008)
  • R.K. Bera

    The story of the Cohen–Boyer patents

    Current Science

    (2009)
  • K. Bergman et al.

    The global stem cell patent landscape: Implications for efficient technology transfer and commercial development

    Nature Biotechnology

    (2007)
  • K. Buchholz et al.

    The roots—A short history of industrial microbiology and biotechnology

    Applied Microbiology and Biotechnology

    (2013)
  • P.-L. Chang et al.

    Using patent analyses to monitor the technological trends in an emerging field of technology: A case of carbon nanotube field emission display

    Scientometrics

    (2010)
  • I.M. Cockburn et al.

    Are all patent examiners equal?: The impact of characteristics on patent statistics and litigation outcomes

    (2002)
  • J.A. Comins et al.

    Detecting seminal research contributions to the development and use of the global positioning system by reference publication year spectroscopy

    Scientometrics

    (2015)
  • J.A. Comins et al.

    RPYS i/o: Software demonstration of a web-based tool for the historiography and visualization of citation classics, sleeping beauties and research fronts

    Scientometrics

    (2016)
  • J.A. Comins et al.

    Citation algorithms for identifying research milestones driving biomedical innovation

    Scientometrics

    (2017)
  • D.J. de Solla Price

    Citation measures of hard science, soft science, technology, and nonscience

    (1970)
  • K.J. Egelie et al.

    The emerging patent landscape of CRISPR-Cas gene editing technology

    Nature Biotechnology

    (2016)
  • B. Elango et al.

    Detecting the historical roots of tribology research: A bibliometric analysis

    Scientometrics

    (2016)
  • A. Fire et al.

    U.S. Patent No. 6,506,559

    (2003)
    1

    The author’s affiliation with The MITRE Corporation is provided for identification purposes only, and is not intended to convey or imply MITRE’s concurrence with, or support for, the positions, opinions or viewpoints expressed by the author. Approved for Public Release; Distribution Unlimited Case #17-0951.
