DOI: 10.1145/3568562.3568564
research-article

COVID-19 Deep Clustering: An Ontology construction clustering method with dynamic medical labeling

Published: 01 December 2022

Abstract

This paper introduces a novel clustering-based framework for constructing a COVID-19 ontology from scientific research articles in the PubMed LitCovid collection. Our study uses a semantic approach with hierarchical clustering to construct a more effective ontology of COVID-19 documents that supports medical labeling and search. We believe this study may serve as a starting point for the future development of an advanced, domain-specific COVID-19 ontology. The main contribution of this research is to address the limitations of manual classification in the face of the rapidly growing number of scientific papers and the resulting overload of unclassified knowledge. It offers scholars a better search mechanism for retrieving highly relevant expert information on the topics that interest them in the COVID-19 literature. To the best of our knowledge, this approach is the first successful attempt to apply automatic clustering with labeling and search to COVID-19 research papers. Moreover, for text processing, we propose a systematic evaluation of our methodology that does not depend on a standard data collection.
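The general idea of hierarchical clustering followed by cluster labeling can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes bag-of-words count vectors as a stand-in for learned biomedical embeddings, average-linkage agglomerative merging on cosine similarity, and labels formed from each cluster's most frequent terms. The function names (`vectorize`, `agglomerate`, `label`) and the toy documents are hypothetical and exist only for this example.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (assumption: a stand-in for a learned embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def agglomerate(docs, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of document indices."""
    vecs = [vectorize(d) for d in docs]
    clusters = [[i] for i in range(len(docs))]
    while len(clusters) > max(1, n_clusters):
        best = None
        # Find the pair of clusters with the highest mean pairwise similarity.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sims = [cosine(vecs[a], vecs[b])
                        for a in clusters[i] for b in clusters[j]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


def label(cluster, docs, k=2):
    """Label a cluster with its k most frequent terms (ties broken alphabetically)."""
    counts = Counter()
    for i in cluster:
        counts.update(vectorize(docs[i]))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy documents, invented for illustration only.
docs = [
    "vaccine trial efficacy vaccine",
    "vaccine dose efficacy",
    "lung imaging ct scan",
    "ct imaging pneumonia lung",
]
clusters = agglomerate(docs, 2)
labels = [label(c, docs) for c in clusters]
for c, l in zip(clusters, labels):
    print(c, l)
```

In this sketch, recording the order of merges instead of stopping at a fixed cluster count would yield a tree of nested clusters, which is the kind of hierarchy an ontology could be built on.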


Published In

SoICT '22: Proceedings of the 11th International Symposium on Information and Communication Technology
December 2022
474 pages
ISBN:9781450397254
DOI:10.1145/3568562

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. BioBert clustering, document classification
  2. hierarchical data clustering

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

SoICT 2022

Acceptance Rates

Overall Acceptance Rate 147 of 318 submissions, 46%
