A new application of community detection for identifying the real specialty of physicians

https://doi.org/10.1016/j.ijmedinf.2020.104161

Highlights

  • We applied community detection in a novel way to identify the real specialty of physicians, using the Spark framework as the big data analysis tool and the Scala and Python programming languages for the implementation.

  • This knowledge can help scientists accurately determine the medical fields in which physicians actually prescribe. Moreover, it could increase the accuracy of future investigations of prescription data.

  • As a part of pre-processing, this can be a solution for missing values of the physicians’ specialty field in prescription data.

Abstract

Background

There is an increasing trend in using network science methods and algorithms, including community detection methods, in different areas of healthcare. These areas include protein networks, drug prescriptions, healthcare fraud detection, and drug abuse. Studies of counterfeit drugs, off-label marketing issues, and the community structure of hospital networks are examples of using community detection in healthcare.

Objective

This paper attempts to find physicians’ real medical specialties based on their prescription history. As a novel application of community detection in the healthcare field, this knowledge can be used as a substitute for missing values in healthcare databases. It could therefore help scientists and researchers obtain more accurate and more reliable results.

Methods

This research combines community detection and big data tools with interviews with field experts. The dataset comprises 32 million medical prescriptions written in 2014, provided by the Health Insurance Organization. The results are validated both qualitatively and quantitatively.

Results

The findings reveal nine major communities of physicians, and the experts' labeling of these communities covers almost every specialty in the drug prescription field. Some communities are labeled as a single well-known specialty, while others consist of two or more overlapping specialties.

Conclusion

After receiving the prescription data and getting the experts’ opinions, it was revealed that some physicians might persistently prescribe drugs that are not in their fields of expertise. Based on the accuracy of the community detection and the existing data values, we confirmed this hypothesis.

Introduction

In recent years, there has been an increasing trend in using graph mining and network science methods in different scientific fields such as healthcare [1], [2], [3], [4]. One of the most attractive methods in these fields is community detection. In network science, a community is a group of nodes that are more strongly connected to one another than to the rest of the network. In the past decades, several community detection algorithms and methodologies have been proposed, aiming to achieve higher output accuracy or more favorable runtime complexity in large networks. One of the more recently proposed is the Louvain algorithm, a well-known greedy algorithm for community detection that handles the general case of weighted graphs. Due to its properties, such as fast convergence, high modularity, and hierarchical partitioning, Louvain has been extensively utilized in many applications [5].
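
To make the notion of modularity concrete, the following is a minimal Python sketch (Python being one of the two languages named in the Highlights). It is not the authors' implementation. It computes the standard Newman modularity, Q = sum over communities c of [ L_c / m - (d_c / 2m)^2 ], where m is the total edge weight, L_c is the edge weight inside community c, and d_c is the summed weighted degree of its nodes. The function name, the edge-dictionary input format, and the toy graph are assumptions made for illustration only.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected weighted graph.

    edges: dict mapping (u, v) -> weight, each undirected edge listed once
           (hypothetical input format chosen for this sketch).
    community: dict mapping node -> community label.
    """
    m = sum(edges.values())          # total edge weight
    internal = defaultdict(float)    # L_c: edge weight inside each community
    degree_sum = defaultdict(float)  # d_c: summed weighted degree per community
    for (u, v), w in edges.items():
        degree_sum[community[u]] += w
        degree_sum[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph (illustrative): two triangles joined by a single bridge edge c-d.
toy_edges = {
    ("a", "b"): 1.0, ("a", "c"): 1.0, ("b", "c"): 1.0,
    ("c", "d"): 1.0,
    ("d", "e"): 1.0, ("d", "f"): 1.0, ("e", "f"): 1.0,
}
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

print(modularity(toy_edges, two_groups))
print(modularity(toy_edges, one_group))
```

On this toy graph, splitting along the bridge scores higher than placing every node in one community, which is the kind of difference a modularity-based method such as Louvain exploits.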

The production speed of routine data sources in healthcare databases has increased significantly in the past few years. This massive data might, however, not be mature enough for analytical or research purposes. Missing values and inefficient updates in these databases can reduce the accuracy and reliability of data. These databases, in addition, may contain anomalous patterns showing wasteful, abusive, or fraudulent activities.

In this research, we apply a community detection approach in an innovative way to identify the real specialty of physicians from their prescription history. To the best of our knowledge, this approach has not yet been taken by other researchers, and this is the first study that uses prescription data to identify the real specialty of physicians by means of community detection. As a new application of community detection in healthcare studies, this approach can be applied in other investigations. The primary motivation of this study is the strategic importance of the “physicians’ specialty” attribute in prescription databases. This attribute is of great importance for further investigations of prescription data, yet around 40% of its entries are missing in our database. Also, as our investigation reveals, some physicians may prescribe medications related to a specialty different from their original one.

The approach presented in this research can be used as an alternative method for handling missing values in similar circumstances in other databases, which is essential in data pre-processing. This study therefore facilitates obtaining reliable results when investigating the prescription database for fraud detection, statistics, epidemiological features of society, medication-related investigations, and many other types of research related to the physicians’ specialty field.
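
As a rough illustration of how detected communities could stand in for missing specialty values, the Python sketch below fills each missing entry with the most common recorded specialty in the physician's community. This majority vote is an assumption made for illustration. In the study itself the communities are labeled by field experts, and the function name, input dictionaries, and sample records here are all hypothetical.

```python
from collections import Counter, defaultdict


def impute_specialty(recorded, community):
    """Fill missing specialties by majority vote within each community.

    recorded: dict mapping physician id -> specialty, or None when missing.
    community: dict mapping physician id -> detected community label.
    Both input formats are hypothetical and chosen for this sketch.
    """
    # Count the recorded specialties observed in each community.
    votes = defaultdict(Counter)
    for physician, specialty in recorded.items():
        if specialty is not None:
            votes[community[physician]][specialty] += 1

    filled = {}
    for physician, specialty in recorded.items():
        counts = votes[community[physician]]
        if specialty is None and counts:
            # Most frequent recorded specialty in this community.
            specialty = counts.most_common(1)[0][0]
        filled[physician] = specialty  # stays None if the community has no votes
    return filled


# Illustrative records only, not data from the study.
recorded = {"p1": "cardiology", "p2": "cardiology", "p3": None,
            "p4": "dermatology", "p5": None, "p6": None}
community = {"p1": 0, "p2": 0, "p3": 0, "p4": 1, "p5": 1, "p6": 2}
print(impute_specialty(recorded, community))
```

A physician whose community contains no recorded specialty is left as missing rather than guessed, which keeps the imputation conservative.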

In the following sections of this paper, a background of related studies is presented. Then, we discuss the data pre-processing and the selection of the required features. Afterward, we delve into the implementation of the Louvain algorithm. Next, we illustrate the methods for enhancing modularity and demonstrate how this helps to find accurate communities. Finally, building on all the previous steps, the field experts classify the visualized communities.
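
The Louvain step in this outline can be illustrated with a short Python sketch of its first phase only, often called local moving: every node starts in its own community and is greedily moved to the neighbouring community that gives the largest modularity gain, until no move helps. This is a simplified sketch and not the authors' Spark implementation. It omits Louvain's second phase (collapsing communities into super-nodes and repeating), and the function name, input format, and toy graph are assumptions.

```python
from collections import defaultdict


def local_moving(edges):
    """Louvain-style local-moving phase on an undirected weighted graph.

    edges: dict mapping (u, v) -> weight, each edge listed once, no self-loops
           (hypothetical input format). Node labels must be sortable.
    Returns a dict mapping node -> community label.
    """
    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w
    degree = {n: sum(nbrs.values()) for n, nbrs in adj.items()}
    two_m = sum(degree.values())      # twice the total edge weight
    community = {n: n for n in adj}   # each node starts alone
    total = dict(degree)              # summed degree per community

    moved = True
    while moved:
        moved = False
        for node in sorted(adj):
            current = community[node]
            # Edge weight from this node to each neighbouring community.
            links = defaultdict(float)
            for nbr, w in adj[node].items():
                links[community[nbr]] += w
            total[current] -= degree[node]  # take the node out temporarily
            # Gain (up to a constant factor) of rejoining the current community.
            best = current
            best_gain = links[current] - total[current] * degree[node] / two_m
            for comm, w_in in links.items():
                gain = w_in - total[comm] * degree[node] / two_m
                if gain > best_gain:
                    best, best_gain = comm, gain
            total[best] += degree[node]
            if best != current:
                community[node] = best
                moved = True
    return community


# Toy graph (illustrative): two triangles joined by a single bridge edge c-d.
toy_edges = {
    ("a", "b"): 1.0, ("a", "c"): 1.0, ("b", "c"): 1.0,
    ("c", "d"): 1.0,
    ("d", "e"): 1.0, ("d", "f"): 1.0, ("e", "f"): 1.0,
}
print(local_moving(toy_edges))
```

Because a node only moves when the gain strictly improves, each move raises modularity and the loop terminates. On the toy graph the two triangles end up as separate communities.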

Section snippets

Background

Recently, many algorithms have been proposed and improved in the community detection field. For instance, Ravasz et al. [6] developed an algorithm to identify the molecular groups forming condensed matter communities. Proposing this algorithm was the first attempt toward the identification of such communities in metabolic networks [7]. Girvan and Newman [8] introduced the Girvan–Newman algorithm to systematically remove the edges linking nodes from different communities. Subsequently, this

Materials and methods

The methods and materials applied in each step of this research are discussed in the following sections. In addition, to convey a clear view of the paper’s structure, Fig. 1 presents the workflow of the paper as a flow chart.

Results

In the following sections, we systematically elaborate on the results of the proposed method.

Discussion

After receiving the prescription data and getting experts’ opinions, it was revealed that some physicians might persistently prescribe drugs that are not in their fields of expertise. For example, an internist may prescribe medications intended for another specialty, although this is not in itself a fault of that physician. Nevertheless, for a more accurate analysis of the prescriptions, one should ignore the original specialty of the physicians and only pay attention to the

Conclusion

In this paper, the authors investigated prescription data to identify the real specialty of physicians. This knowledge could help scientists perform more accurate analyses in the field of physicians’ drug prescriptions. However, in doing so, we encountered obstacles and limitations, such as the small number of informative features in the data and some hardware limitations.

The knowledge provided by this study can pave the path toward more practical and useful findings. For future

Authors’ contribution

Mr. Saeed Shirazi contributed to every part of this research. Prof. Amir Albadvi and Dr. Farshad Farzadfar were involved in planning and supervising the research. Dr. Babak Teimourpour participated in developing the theory and implementation phase. Dr. Elham Akhondzadeh verified the analytical methods and outputs and helped with all the technical details. All authors have discussed and contributed to the final version of the manuscript.

Summary Points

Already known

  • For every physician, there is a

Authors disclosure statement

No competing financial interests exist.

Funding

This study received no funding.

Acknowledgment

Thanks to Dr. Farshad Sharifi, Assistant Professor in the Elderly Health Research Center at Endocrinology and Metabolism Research of Tehran University, and Miss Mina Dehghani, Ph.D. Candidate in Pharmacoeconomics and Pharmaceutical Administration at Tehran University of Medical Sciences, for their help in validating this paper's output, and to the staff of Endocrinology and Metabolism Research of Tehran University for their support.

References (35)

  • X. Que et al., Scalable Community Detection with the Louvain Algorithm (2015)

  • E. Ravasz, Hierarchical organization of modularity in metabolic networks, Science (2002)

  • A.-L. Barabási, Chapter 9: Communities (2015)

  • M.E.J. Newman et al., Finding and Evaluating Community Structure in Networks (2004)

  • M.J. Rattigan et al., Graph clustering with network structure indices, Int. Conf. Mach. Learn. (2007)

  • J.W. Pinney et al., Betweenness-based decomposition methods for social and biological networks, Soft Matter (2005)

  • F. Radicchi et al., Defining and Identifying Communities in Networks (2003)