A methodology for identifying breakthrough topics using structural entropy

https://doi.org/10.1016/j.ipm.2021.102862

Highlights

  • Identifying a scientific breakthrough early and helping to establish forward-looking predictions.

  • Depicting the non-linear characteristics of complex knowledge networks through structural changes.

  • Regarding the knowledge network as a complex system from a holistic perspective.

  • Observing the incubation mechanism of emergent scientific breakthroughs from a dynamic evolutionary perspective.

Abstract

This research uses link prediction and structural-entropy methods to predict scientific breakthrough topics. Temporal changes in the structural entropy of a knowledge network can be used to identify potential breakthrough topics. This has been done by tracking and monitoring a network's critical transition points, also known as tipping points. The moment at which a significant change in the structural entropy of a knowledge network occurs may denote the points in time when breakthrough topics emerge. The method was validated by domain experts and was demonstrated to be a feasible tool for identifying scientific breakthroughs early. This method can play a role in identifying scientific breakthroughs and could aid in realizing forward-looking predictions to provide support for policy formulation and direct scientific research.

Introduction

Scientific breakthroughs allow research to be channeled in directions that were previously inaccessible. Thus, such developments are highly innovative and represent the forefront of scientific research. The early identification of breakthrough topics is important for policy formulation and strategic management by governments, businesses, and other organizations. Identifying major breakthroughs early gives policy-makers ample time to react. Compared to incremental innovation, breakthrough innovation is more difficult to identify because it takes time to recognize a development and gather the information needed for analysis. In some cases, luck and intuition play a role in discovering this information, especially when serendipity is involved (van Andel, 1994; Merton et al., 2004; Fukawa, 2006; Seymour, 2009).

Studies have primarily explored the identification of scientific breakthroughs using qualitative methods that depend on expert judgment. However, against the background of interdisciplinary integration, relying solely on experts is excessively time-consuming and often leads to contradictory results. An abundance of powerful data-processing resources, analytical tools, and algorithms has emerged that provides effective support for, or alternatives to, expert opinion. Currently, most methods used to identify breakthrough topics rely heavily on specific attributes (e.g., recency, novelty, integrativeness, knowledge configurability, and word frequency). However, these methods fail to consider the processes of gestation, propagation, development, and mutation.

The early identification of scientific breakthroughs relies on a deep understanding of the laws of technological innovation. Scientific innovation is a nonlinear developmental process, and different knowledge topics interrelate to form complex networks. In these networks, newly emerging topics affect the structure of the original network (Chen, 2012, 2015; Dahlin & Behrens, 2005; Wan, 2017; Xu et al. 2019; Xu et al. 2021; Zhang et al. 2021). The greater the degree of innovation, the greater the impact of the topic on the knowledge-network structure; scientific breakthroughs therefore always affect the structure of knowledge networks (Luo, 2020). Evaluating the impact of topics on knowledge-network structures can thus identify potential breakthrough topics. Structural entropy considers the number of communities and their sizes to encapsulate a richer representation of the network's structure in a single value (Almog et al., 2019). It also regards the network holistically and dynamically as a complex system by considering its evolutionary processes (Guan, 2014; Huo, 2019; Xu et al. 2019; Xu et al. 2021), and it can therefore help to identify scientific breakthroughs during their early stages. An increase in the entropy of the network structure can be an indicator of the network's evolution. The growth rate is high during early network evolution, but as the network grows, its structure stabilizes and the growth rate decreases (Luo et al., 2013).
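To make the idea of condensing the number of communities and their sizes into a single value concrete, the following minimal sketch computes the Shannon entropy of the community-size distribution. This is one common reading of the description above and is an assumption here; it is not necessarily the index the authors construct. The function name and the node-to-community input format are hypothetical.

```python
import math
from collections import Counter


def structural_entropy(community_of):
    """Shannon entropy of the community-size distribution.

    `community_of` maps each node to a community label. This input
    format is an assumption; the partition could come from any
    community-detection method.
    """
    n = len(community_of)
    if n == 0:
        return 0.0
    # Size of each community, i.e. how many nodes carry each label.
    sizes = Counter(community_of.values())
    # H = -sum(p_i * ln p_i), where p_i is the share of nodes in community i.
    return -sum((s / n) * math.log(s / n) for s in sizes.values())
```

Under this definition, a network whose nodes all fall into one community has entropy zero, and the value grows as nodes spread more evenly over more communities.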

In this study, we attempt to combine link prediction and structural entropy methods to predict scientific breakthrough topics. Compared with extant prediction indicators, the structural entropy index regards the knowledge network as a complex system from a holistic perspective. This method is used to recognize emerging topics that significantly impact the knowledge-network structure and regards them as potential breakthrough topics. Finally, the prediction results of our proposed method are evaluated by domain experts to assess their validity and reliability.
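As an illustration of what a link-prediction step can look like, the sketch below scores each currently unlinked pair of topics by its number of shared neighbors (the common-neighbors index). The text here does not say which link-prediction index is used, so this choice, the function name, and the edge-list input format are all assumptions.

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Score every unlinked node pair by its number of shared neighbors.

    `edges` is an iterable of (u, v) pairs from an undirected network.
    Node labels must be mutually sortable (e.g. all strings).
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        # Only pairs without an existing link are prediction candidates.
        if v not in neighbors[u]:
            scores[(u, v)] = len(neighbors[u] & neighbors[v])
    return scores
```

Pairs with higher scores are treated as more likely to become linked in a later time slice, which gives a predicted network whose structural entropy can then be evaluated.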

Our objective is to provide a method that allows the detection of scientific breakthroughs in their early stages. The method is based on tracking and monitoring critical points in the evolution of complex knowledge networks to identify potential breakthrough topics. The moment at which a significant change in a network's structural entropy occurs, i.e., the tipping point, can be used to denote the point at which breakthrough topics emerge. Unlike the extant prediction indicators, the structural entropy index holistically regards the knowledge network as a complex system. Our proposed method considers the mechanism of the emergence of scientific breakthroughs from a process perspective, and we believe this is beneficial for the early identification of scientific breakthrough topics. In this respect, our method stands out from most current research, which is inclined toward monitoring extant hot topics and is not forward-looking.
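The monitoring idea described above can be sketched as a simple scan over a time series of structural-entropy values that flags periods where the change from the previous period is large. What counts as a "significant" change is not specified here, so the fixed absolute threshold, the function name, and the period-to-entropy input format are assumptions for illustration only.

```python
def tipping_points(entropy_by_period, threshold):
    """Flag periods whose entropy change from the previous period is large.

    `entropy_by_period` maps a sortable period label (e.g. a year) to the
    structural entropy of that period's network. `threshold` is a
    hypothetical cut-off on the absolute change.
    Returns a list of (period, change) tuples in chronological order.
    """
    periods = sorted(entropy_by_period)
    flagged = []
    for prev, curr in zip(periods, periods[1:]):
        delta = entropy_by_period[curr] - entropy_by_period[prev]
        if abs(delta) > threshold:
            flagged.append((curr, delta))
    return flagged
```

In practice the threshold could instead be derived from the series itself, for example from the spread of past changes, rather than being fixed in advance.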

The remainder of this paper is organized as follows. First, this study details the characteristics of scientific breakthroughs and reviews extant prediction methods. Second, we explain the principles and processes of constructing the prediction model based on structural entropy and link prediction. Then, the field of genetically engineered vaccines (GEV) is used as a testbed, and the prediction results are compared with expert evaluations. Finally, we sum up the advantages and disadvantages and elaborate on future research directions.

Section snippets

Theoretical framework

This section introduces the theoretical framework of our research, which draws on the existing literature on scientific breakthroughs and on the methods for identifying them. We also classify scientific breakthroughs based on the knowledge network to which they belong. In addition, a new method is introduced that is intended to overcome the shortcomings of existing methods.

The concept of a scientific breakthrough

No single definition of the concept of a scientific breakthrough is agreed on by all scholars. Scientific breakthroughs are often related to scientific revolutions and changes in scientific paradigms (Kuhn, 1962; Fortunato et al., 2018; Min, Bu, & Sun, 2021). Breakthrough developments are considered to overcome obstacles for further scientific and technical progress. Scientific breakthroughs may produce new theories or may improve existing ones (Merton, 1973; Wray, 2011). The cognitive

Dynamics of science and technology

How knowledge is produced, organized, and disseminated in science, technology, social science, and the humanities is a fundamental issue in the dynamics of science (Coccia, 2020a; Gibbons et al., 1994). Multiple studies have shown that the dynamics of science depend on manifold factors, including historical, institutional, political, and research contexts, such as mental ability, the existing status of culture, institutions, and research funding (Börner et al. 2012, Coccia et al. 2015,

Extant identification methods

Many scholars have studied how to identify and predict breakthrough innovations from different perspectives. The most commonly used quantitative methods for detecting groundbreaking research frontiers are citation analysis, topic mutation analysis, sleeping beauty literature analysis, technical evolution methods, and analyses based on machine learning models. Table 1 shows the characteristics, advantages, and disadvantages of the different methods.

The influence of breakthroughs on the structure of knowledge networks

From the perspective of network dynamics, the entry of new information influences the structure of the cognitive network into which it is introduced and may also influence the network's stability. According to Chen (2012), the impact of new knowledge on the original knowledge structure can be measured by the degree of structural change. Dahlin and Behrens (2005) assessed the structural differences of patent-citation networks to identify scientific breakthroughs, noting that the greater

The evolution of network structures

‘Weak ties’ refers to the early characteristics of interdisciplinary and technological integration. In a network structure, a weak tie refers to the type of node relationship of which the strength is lower than a given threshold. Thus, it is often considered to be the opposite of a strong tie. Granovetter (1983) demonstrated that a strong tie could maintain a relationship within an organization, whereas a weak one could link different groups and organizations for information transfer. Weak

Knowledge networks and structural entropy theory

We argue that (radical) changes in entropy signal breakthroughs. Scientific breakthroughs change the structure of the citation network of publications, but other mechanisms that are not ‘breakthroughs’ may also result in entropy changes. We focus on 1) the scientific knowledge network, 2) the relationship between the state of the knowledge network and structural entropy theory, 3) structural entropy indicators, and 4) using link prediction to predict the evolution of

Material

This study selected genetically engineered vaccines (GEVs) as the experimental field. Research into GEVs is expected to be the source of innovative vaccines. In this study, scientific impact is confined to academic impact, ignoring societal, economic, and other effects. Therefore, instead of patents and other information, journal papers were chosen as the empirical data. Scientific papers from the GEV field were collected to provide the analysis datasets. The Web of Science database was selected

Methods

Using scientific knowledge networks to explore scientific dynamics

Science can be seen as an evolving system of diverse basic units of science that are tightly linked and dynamically coupled. Networks can represent the collective, self-organized emerging structures in science, and allow the linking of structural properties to dynamic processes (Börner et al. 2009). The scientific knowledge network is a complex system, including three types of networks: ontology networks, knowledge subject

Data processing

Data preprocessing

After removing duplicate data, 4,374 records were retained. Figure 2 displays the statistical analysis of these data. The number of papers in the GEV field increased after 1990, until entering a steady growth state in 1993, recently reaching its publication peak. From 1989 to 1991, the field experienced especially rapid growth. From an additional examination of titles and keywords, we found that, for papers newly published in 1991, the main GEV research focused on

Discussion

Our research gives insight into the dynamics of science because the structural entropy method regards networks as complex systems from a holistic and dynamic perspective and considers network growth through the lens of dynamic evolutionary processes. Moreover, structural entropy can measure the changes in both the strong and weak ties between topics in the domain knowledge network. Thus, it can function as an early signal of domain topic changes and help to identify a scientific breakthrough

Conclusion

A structural entropy measurement index for knowledge-network assessment, based on network structures and their nonadditive characteristics, was constructed from the perspective of changing knowledge networks. This research attempts to combine link prediction and structural entropy methods to predict scientific breakthrough topics. By tracking the temporal change in a network's structural entropy, potential breakthrough topics can be identified. An empirical analysis

Future work

Based on the findings of this study, we foresee three future research activities: 1) revising the assessment, 2) capturing the early weak signals of scientific breakthroughs more effectively, and 3) paying attention to the relevant dynamic evolutionary processes.

Acknowledgments

We appreciate the constructive advice and suggestions from Professor Shuo Xu at the College of Economics and Management, Beijing University of Technology. This article is an outcome of the Taishan Scholars Youth Expert Program of Shandong Province (202103069). The “Study on the Recognition Method of Innovative Evolving Trajectory based on Topic Correlation Analysis of Science and Technology” (No.71704170) is supported by the National Natural Science Foundation of China, China Postdoctoral

References (137)

  • Y. Liu et al.

    A Review of Early Recognition of Breakthrough Innovations and the Weak Signal Analysis

    Library and Information Service

    (2021)
  • C. Min et al.

    Predicting scientific breakthroughs based on knowledge structure variations

    Technological Forecasting and Social Change

    (2021)
  • C. Min et al.

    Identifying citation patterns of scientific breakthroughs: a perspective of dynamic citation process

    Information Processing & Management

    (2021)
  • P. Savov et al.

    Identifying breakthrough scientific papers

    Information Processing & Management

    (2020)
  • H. Small et al.

    Identifying emerging topics in science and technology

    Research Policy

    (2014)
  • J. Adams

    The rise of research networks

    Nature

    (2012)
  • J. Adams

    The fourth age of research

    Nature

    (2013)
  • A. Almog et al.

    Structural entropy: monitoring correlation-based networks over time with application to financial markets

    Scientific reports

    (2019)
  • H. Andersen et al.

    The cognitive structure of scientific revolutions

    (2006)
  • W.B. Arthur

    The nature of technology: What it is and how it evolves

    (2009)
  • V.D. Blondel et al.

    Fast unfolding of communities in large networks

    Journal of statistical mechanics: theory and experiment

    (2008)
  • K. Börner et al.

    An introduction to modeling science: basic model types, key definitions, and a general framework for the comparison of process models

  • K. Börner et al.

    Modeling science: Studying the structure and dynamics of science

    Scientometrics

    (2011)
  • Boya, S. C. B. (2018). Use stem cell therapy to treat cancer, Alzheimer's disease and other diseases. Retrieved from...
  • K.W. Boyack et al.

    Mapping the backbone of science

    Scientometrics

    (2005)
  • K. Brad Wray

    Kuhn and the Discovery of Paradigms

    Philosophy of the Social Sciences

    (2011)
  • M. Cai et al.

    A new network structure entropy based node difference and edge difference

    Acta Physica Sinica

    (2011)
  • Y. Cai et al.

    Influences of Power Grid Structure on Cascading Failure Based on Standard Structure Entropy

    Transactions of China Electrotechnical Society

    (2015)
  • K. Cao et al.

    Tsallis entropy and nonextensive statistical mechanics

    Journal of Yunnan University

    (2005)
  • C. Chen

    Turning points: The nature of creativity

    (2012)
  • C. Chen et al.

    Turning Points: the nature of creativity

    (2015)
  • 2019 Scientific Development Report

    (2020)
  • Clausius, R. (1865). Presentation to the Philosophical Society of Zurich. Retrieved from...
  • M. Coccia

    General properties of the evolution of research fields: a scientometric study of human microbiome, evolutionary robotics and astrobiology

    Scientometrics

    (2018)
  • M. Coccia

    Theories and laws of scientific development

    Working Paper CocciaLab n. 50/2020

    (2020)
  • M. Coccia et al.

    Emerging nanotechnological research for future pathway of biomedicine

    International Journal of Biomedical nanoscience and nanotechnology

    (2012)
  • M. Coccia

    The evolution of scientific disciplines in applied sciences: dynamics and empirical properties of experimental physics

    Scientometrics

    (2020)
  • M. Coccia et al.

    Human progress and its socioeconomic effects in society

    Journal of Economic and Social Thought

    (2018)
  • L.D.F. Costa et al.

    Complex networks: the key to systems biology

    Genetics and Molecular Biology

    (2008)
  • L.d.F. Costa et al.

    Characterization of complex networks: A survey of measurements

    Advances in physics

    (2007)
  • MIT Technology Review 2020 "Top Ten Global Breakthrough Technologies"

    Chinese Technology Business

    (2020)
  • H. Dong et al.

    A 2D Structure Entropy-based Approach to Security Assessment of Communication-based Train Control System

    Acta Automatica Sinica

    (2019)
  • H. Du

    Progress in International Research and Development of Therapeutic Cancer Vaccine

    Progress in Pharmaceutical Sciences

    (2018)
  • J. Du

    A study on systematic identification of “Sleeping Beauty” publications and on their awaking mechanism

    (2017)
  • O. Eulaerts et al.

    Weak signals in Science and Technologies - 2019 Report, EUR 29900 EN, Publications Office of the European Union

    (2019)
  • D. Fanelli et al.

    Bibliometric evidence for a hierarchy of the sciences

    PLoS ONE

    (2013)
  • S. Fortunato et al.

    Science of science

    Science

    (2018)
  • Y. Fu et al.

    Breakthrough innovation: concept definition and comparison

    The Journal of Quantitative & Technical Economics

    (2004)
  • I. Fukawa

    Case studies on how to enhance the chance of technical breakthrough and (pseudo) serendipity

  • R.J. Funk et al.

    A dynamic network measure of technological change

    Management Science

    (2017)