research-article

Extracting Features from Online Forums to Meet Social Needs of Breast Cancer Patients

Authors:
Maitreyi Mokashi, Enming Zhang, Josette Jones, Sunandan Chakraborty

School of Informatics and Computing, Indiana University Purdue University Indianapolis, Indianapolis, IN

COMPASS '20: Proceedings of the 3rd ACM SIGCAS Conference on Computing and Sustainable Societies, June 2020, Pages 198–207
https://doi.org/10.1145/3378393.3403652

Published: 01 July 2020

ABSTRACT

Breast cancer patients face many difficulties while undergoing treatment, and many of these are personal, social, or professional rather than medical. Because such issues are not directly medical in nature, they are not discussed with healthcare providers and are therefore left out of the treatment plan, even though they are vital to the patient's complete recovery. We present a novel approach that serves as a first step toward incorporating the personal and social issues arising from breast cancer treatment into a patient's treatment plan. Patients share their experiences and post questions about their treatments and subsequent side effects on numerous online forums. We collected data from one such forum, the "Online Breast Cancer Forum", where users (patients) have created threads on many related topics. We use these message threads to identify critical issues that patients face and how those issues relate to their treatment. We convert the forum data into a bipartite network and map the network nodes into a high-dimensional feature space. In this feature space, we perform community detection to uncover latent connections between patients and topics. We claim that these latent connections, together with the known ones, can form a new knowledge base that will eventually help physicians anticipate the non-medical issues associated with a prescribed treatment and so plan a more adaptive, personalized treatment. We evaluated our method against two baseline methods and show that it outperforms them by 25% on a manually labeled reference dataset.
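The pipeline outlined in the abstract (bipartite network, node features, community detection) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the toy `posts` data and all function names are hypothetical, the node features are simple one-hop and two-hop walk counts standing in for a learned network embedding, and plain k-means stands in for whichever community detection method the paper uses.

```python
from collections import defaultdict

# Hypothetical toy data: (patient, topic) means "patient posted in a thread on topic".
posts = [
    ("user_a", "hair_loss"), ("user_a", "work_leave"),
    ("user_b", "hair_loss"),
    ("user_c", "insurance"), ("user_c", "childcare"),
    ("user_d", "insurance"),
]

def build_bipartite(pairs):
    """Return an undirected adjacency map linking patients and topics."""
    adj = defaultdict(set)
    for patient, topic in pairs:
        adj[patient].add(topic)
        adj[topic].add(patient)
    return adj

def node_features(adj):
    """One vector per node: counts of walks of length 1 and 2 to every node.

    Assumption: a simple structural stand-in for a learned network embedding.
    """
    nodes = sorted(adj)
    index = {n: i for i, n in enumerate(nodes)}
    feats = {}
    for n in nodes:
        vec = [0.0] * len(nodes)
        for nb in adj[n]:
            vec[index[nb]] += 1.0        # direct (known) connection
            for nb2 in adj[nb]:
                vec[index[nb2]] += 1.0   # two-hop (shared-neighbour) connection
        feats[n] = vec
    return feats

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(feats, k=2, iters=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    nodes = sorted(feats)
    centers = [feats[nodes[0]]]
    while len(centers) < k:
        centers.append(max((feats[n] for n in nodes),
                           key=lambda v: min(sq_dist(v, c) for c in centers)))
    labels = {}
    for _ in range(iters):
        new = {n: min(range(k), key=lambda j: sq_dist(feats[n], centers[j]))
               for n in nodes}
        if new == labels:                # assignments stable: converged
            break
        labels = new
        for j in range(k):
            members = [feats[n] for n in nodes if labels[n] == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

adj = build_bipartite(posts)
communities = kmeans(node_features(adj), k=2)

# Patient-topic pairs in the same community without a direct edge
# are candidate latent connections.
patients = {p for p, _ in posts}
topics = {t for _, t in posts}
latent = sorted((p, t) for p in patients for t in topics
                if communities[p] == communities[t] and t not in adj[p])
print(latent)
```

In this toy network, a patient who never posted about a topic can still land in the same community as that topic through a shared thread with another patient, which is the kind of latent patient-topic connection the abstract refers to.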

Published in

COMPASS '20: Proceedings of the 3rd ACM SIGCAS Conference on Computing and Sustainable Societies
June 2020
359 pages
ISBN:9781450371292
DOI:10.1145/3378393

Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Health informatics
Network embedding
Social computing
Text mining
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 25 of 50 submissions, 50%