Investigating classification supervised learning approaches for the identification of critical patients’ posts in a healthcare social network

https://doi.org/10.1016/j.asoc.2020.106155

Highlights

  • Healthcare social networks allow enhancing patient care and education.

  • Possible distribution of wrong information causes risks for patients.

  • Classification of critical patients’ posts is performed via supervised learning.

  • Critical post identification triggers the intervention of medical operators.

Abstract

Nowadays, Healthcare Social Networks (HSNs) offer the possibility to enhance patient care and education. However, they also present potential risks for patients due to the possible distribution of poor-quality or wrong information and its misinterpretation. On the one hand, doctors and practitioners want to promote the exchange of information among patients about a specific disease; on the other hand, they do not have enough time to read patients’ posts and moderate them when required. In this paper, we investigate and compare different supervised learning classifiers that we adopted for the classification of critical patients’ posts that can trigger the intervention of medical personnel. In particular, considering different Bayesian, Linear and Support Vector Machine (SVM) classifiers, we analyze their accuracy under different n-gram dataset preparation approaches in order to identify the best approach for the identification of critical patients’ posts in a Healthcare Social Network.

Introduction

Nowadays, all over the world, investment in Information and Communication Technology (ICT) for health, aging and well-being (eHealth) is rapidly increasing. The global eHealth market is expected to reach USD 308.0 billion by 2022, according to a recent report by Grand View Research Inc. In particular, the transition of the healthcare industry to digital healthcare systems for the management and analysis of patients’ health is expected to be the most vital driver for the market [1]. The European Commission’s eHealth Action Plan 2012–2020 has already provided a roadmap to empower patients and healthcare workers, to link up devices and technologies, and to invest in research towards the personalized medicine of the future [2]. In 2015, the European initiative called FICHE aimed to accelerate small and medium enterprises in the development of new cutting-edge eHealth applications by means of the FIWARE technology [3]. In 2017, the European Commission set up an internal task force bringing together technology and health policy makers to examine EU policy actions to ensure the transformation of health care into a Digital Single Market (DSM), bringing benefits for people, health care systems and the economy [4]. Guaranteeing access to high-quality health care is a key objective of social protection systems in European countries, and it represents the second largest social expenditure item after pensions. In this panorama, social media represent a tempting opportunity for healthcare operators to improve patients’ well-being. Currently, many social media tools are available over the Internet, such as social networking, professional networking, media sharing, content production including blogs (e.g., Tumblr) and micro-blogs (e.g., Twitter), knowledge/information aggregation (e.g., Wikipedia), and virtual reality and gaming environments (e.g., Second Life).
In particular, many Healthcare Social Networking (HSN) platforms have emerged with the purpose of enhancing patient care and education. Popular HSN platforms include Sermo, Doximity, Orthomind, QuantiaMD, WeMedUp and Digital Healthcare. However, these social networks require the massive involvement of medical personnel acting as moderators. In fact, healthcare social networks present potential risks for patients due to the possible distribution of poor-quality or wrong information and its misinterpretation. On the one hand, clinical operators want to promote the exchange of information among patients about a specific disease; on the other hand, they do not have enough time to read patients’ posts and moderate them when required. The benefits of HSNs include:

  • promoting networking and information exchange, enabling self-education among patients about particular diseases;

  • sharing patients’ experiences that can be helpful for other ones;

  • supporting the treatment process;

  • reducing the patient’s stress when he/she is waiting for a diagnosis or when he/she discovers that he/she is affected by a particular disease;

  • promoting information gathering and prevention campaigns regarding specific diseases;

  • optimizing the work of the clinical personnel who interact with patients skilled in their diseases;

  • promoting knowledge management;

  • promoting research and monitoring activities.

On the other hand, HSNs present several risks including:

  • possible distribution of poor-quality or wrong information among patients;

  • need for qualified medical personnel who promptly read patients’ posts and reply to them;

  • medical operators often do not have the time to read patients’ posts and reply to them;

  • medical operators do not want to bear responsibility for the consequences for patients (worsening, risk of death or death) when they do not reply in time;

  • possible legal issues for the medical personnel;

  • risks for the reputation of the medical personnel.

In our previous scientific work [5], in order to mitigate the aforementioned risks, we proposed a Patients’ Posts Moderator (PPM) architecture blueprint whose basic flowchart is shown in Fig. 1. Motivated by the fact that cognitive computing is currently emerging in all ICT fields [6], the basic idea behind that work was to carry out an automatic analysis of patients’ posts by means of machine learning techniques in order to identify possible critical issues, hence helping medical operators to take action when required. Both patients and medical personnel interact by means of an HSN platform. A Patients’ Posts Analysis System (PaPAS) works as a batch process that continuously analyzes the patients’ posts of an HSN platform. When a critical issue is detected, it generates an event that is caught and processed by a Complex Event Processing (CEP) component. An alert message is then sent to the interested medical personnel, who can step into the HSN platform, replying to critical patients’ discussion groups and/or triggering medical interventions (doctors can directly contact the patient or send an ambulance with medical equipment if required). The main purpose of PaPAS is to analyze patients’ posts and evaluate possible critical issues that may trigger clinicians’ intervention. It includes the following sub-components: (i) Extractor, whose role is to extract patients’ content from the HSN platform; (ii) Selector, whose role is to select relevant keywords; (iii) Rank Generator, whose role is to rank selected keywords; (iv) Categorisator, whose role is to categorize the various levels of seriousness; (v) Classificator, whose role is to classify patients’ posts according to different categories; and (vi) Evaluator, whose role is to assess the quality of the results.

This paper extends [5], specifically focusing on the Classificator sub-component. In particular, we implemented and compared different supervised learning classification algorithms (referred to as classifiers) that we adopted for the classification of critical patients’ posts containing poor-quality or wrong information that can trigger the intervention of medical personnel. The choice to consider supervised learning algorithms instead of unsupervised ones is motivated by the fact that the scientific literature has shown them to be well suited to typical classification problems. Specifically, defining an n-gram as a contiguous sequence of n words in patients’ posts, we arranged several datasets, with which we trained Bayesian, Linear and Support Vector Machine (SVM) classifiers in order to analyze their accuracy.
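The n-gram definition above can be illustrated with a minimal sketch. It is not the dataset preparation used in the paper: the lowercase, whitespace-based tokenization, the function names `word_ngrams` and `ngram_counts`, and the example posts are all assumptions introduced here for illustration.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the contiguous word n-grams of `text` as tuples.

    Tokenization (lowercasing and splitting on whitespace) is an
    assumption made for this sketch, not the paper's procedure.
    """
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(posts, n):
    """Count n-grams over a collection of posts (a bag of n-grams)."""
    counts = Counter()
    for post in posts:
        counts.update(word_ngrams(post, n))
    return counts


# Hypothetical posts, invented for illustration only.
posts = ["I feel much worse today", "I feel better today"]
bigrams = ngram_counts(posts, 2)
```

Counts of this kind are one common way to turn posts into feature vectors that a supervised classifier can be trained on; a text shorter than n words simply yields no n-grams.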

The remainder of the paper is organized as follows. In Section 2 we provide a brief overview of the major recent initiatives in the fields of machine learning and social media for eHealth. In Section 3, we present the adopted training dataset and software tools used to train classifiers. In Section 4, we present the adopted method focusing on data collection, dictionary arrangement, dataset preparation and choice of classifiers. Experimental results and discussion are provided in Section 5. Conclusion and future developments are summarized in Section 6.

Section snippets

Background and related work

Social media aimed at improving healthcare quality is an emerging research topic [7]. In this Section, we provide a brief description regarding: (i) the impact and benefits of social media in healthcare; (ii) how social media are revolutionizing the whole healthcare marketing; (iii) the most recent best practices experienced in clinical centers; (iv) potential risks for patients; (v) recent initiatives regarding the use of Twitter in healthcare; (vi) data mining in healthcare, and (vii) the use

Training dataset and analysis tools

Before selecting the dataset with which the classifier is trained, we needed to choose the social network source.

Twitter is a well-known social network. It is based on tweets, i.e., microblog posts of up to 280 characters that can be shared (retweeted) or commented on, thus feeding a conversation. Tweets can include one or more hashtags (#), which are very important when topic-based data mining needs to be carried out.
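As a rough sketch of why hashtags help topic-based data mining, the snippet below selects tweets by hashtag. The regular expression, the function names `extract_hashtags` and `filter_by_hashtag`, and the sample tweets are assumptions made for illustration; this is not the collection procedure used in the paper.

```python
import re

# Assumption: a hashtag is '#' followed by letters, digits or underscores.
HASHTAG_RE = re.compile(r"#(\w+)")


def extract_hashtags(tweet):
    """Return the hashtags found in a tweet, lowercased, without '#'."""
    return [tag.lower() for tag in HASHTAG_RE.findall(tweet)]


def filter_by_hashtag(tweets, hashtag):
    """Keep only the tweets that contain the given hashtag."""
    wanted = hashtag.lstrip("#").lower()
    return [t for t in tweets if wanted in extract_hashtags(t)]


# Hypothetical tweets, invented for illustration only.
tweets = [
    "Feeling better after the visit #Health",
    "Great match tonight #sports",
    "New guidelines published #health #eHealth",
]
health_tweets = filter_by_hashtag(tweets, "#health")
```

Matching is case-insensitive here, so `#Health` and `#health` are treated as the same topic; whether that is desirable depends on the hashtag being mined.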

Among the hashtags available in the healthcare domain, we focused on the Child Sex Abuse

Classifier training method

The classification process follows the phases highlighted in Fig. 2. It analyzes the measurements of a generic object in order to identify the category or class to which that object belongs. Typical examples of classification problems include the classification of credit card requests, the placement of a patient in a specific intensive-care unit, and so on. In our case study, the objects to be classified are users’ tweets related to the #CSAQT hashtag. #CSAQT tweets are pre-processed to remove URLs,

Experimental results

The experiments have been carried out by comparing six classification algorithms, already described in Section 4, for three datasets, considering the notincr, incr2step and incr approaches.

Fig. 3(a) and Fig. 3(b) show how the vocabulary sizes changed in the train and test phases, respectively, according to the three cycling approaches used to build datasets and the number of n-grams, whereas Fig. 3(c) shows how the vocabulary sizes changed considering both train and test phases, according to

Conclusion and future developments

In this work, we investigated an NLP approach in a real HSN case study based on Twitter. The idea behind the analysis was to classify tweets according to three message levels, i.e., alarm, notalarm and suspect, in order to create a tool aimed at patients, their family members and medical operators that is able to address emergency events by warning of possible changes in the patients’ health status.

The analysis followed three different n-grams datasets preparation approaches, called notincr, incr2step

CRediT authorship contribution statement

Lorenzo Carnevale: Investigation, Software, Data curation, Writing - original draft. Antonio Celesti: Conceptualization, Methodology, Investigation, Validation, Writing - original draft, Writing - reviewing & editing. Giacomo Fiumara: Conceptualization, Methodology, Investigation, Writing - original draft. Antonino Galletta: Validation, Visualization. Massimo Villari: Visualization, Supervision.

Declaration of Competing Interest

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.asoc.2020.106155.

Acknowledgment

This work was supported by the Italian Healthcare Ministry funded project Young Researcher (under 40 years) entitled “Do Severe acquired brain injury patients benefit from Telerehabilitation? A Cost-effectiveness analysis study” - GR-2016-02361306.

References (46)

  • G. Fiumara, A. Celesti, A. Galletta, L. Carnevale, M. Villari, Applying Artificial Intelligence in Healthcare Social...
  • Wan J. et al.

    Intelligent equipment design assisted by cognitive internet of things and industrial big data

    Neural Comput. Appl.

    (2018)
  • Ranney M. et al.

    Social media and healthcare quality improvement: A nascent field

    BMJ Qual. Saf.

    (2016)
  • Smailhodzic E. et al.

    Towards new social media logic in healthcare and its interplay with clinical logic

  • Smailhodzic E. et al.

    Social media disruptive change in healthcare: Responses of healthcare providers?

  • Smailhodzic E. et al.

    Social media use in healthcare: A systematic review of effects on patients and on their relationship with healthcare professionals

    BMC Health Serv. Res.

    (2016)
  • Malvey D. et al.

    Healthcare marketing and social media

  • Opel D.

    Ethical information flows: Working with/against the healthcare industry’s fascination with social media

  • Koumpouros Y. et al.

    The importance of patient engagement and the use of social media marketing in healthcare

    Technol. Health Care

    (2015)
  • Bugrezova E.

    The social media contribution into healthcare practices among Russian young people

    Ekon. Sotsiologiya

    (2016)
  • Abdullatif A. et al.

    Evolution of social media in scientific research: A case of technology and healthcare professionals in Saudi Universities

    J. Med. Imaging Health Inform.

    (2017)
  • Lee Ventola C.

    Social media and health care professionals: Benefits, risks, and best practices

    P and T

    (2014)
  • Hors-Fraile S. et al.

    The unintended consequences of social media in healthcare: New problems and new solutions

    Yearb. Med. Inform.

    (2016)

    This paper is an extended, improved version of the paper “Applying Artificial Intelligence in Healthcare Social Networks to Identity Critical Issues in Patients’ Posts” presented at AI4Health 2018 workshop and published in: BIOSTEC 2018, Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies, Volume 5: HEALTHINF, Funchal, Madeira, Portugal, 19–21 January, 2018, pp. 680-687, ISBN: 978-989-758-281-3, INSTICC, 2018.

    1

    on behalf of GNCS—Gruppo Nazionale per il Calcolo Scientifico - INdAM.
