Abstract
Co-occurrence analysis of medical subject heading (MeSH) terms in bibliographic databases is used in both bibliometrics and text mining. Because MeSH has a hierarchical structure and MeSH descriptors represent different semantic types (e.g., diseases, chemicals), biclustering MeSH terms of distinct semantic types may be a novel approach to knowledge discovery. This study aimed to bicluster high-frequency MeSH terms based on the co-occurrence of distinct semantic types in a MeSH tree, so as to represent the structure (or status) of a scientific topic more specifically. The study comprised four parts: construction of a MeSH term co-occurrence matrix of distinct semantic types, the biclustering algorithm, a case study, and a comparison. The first three parts were completed using R and the gCLUTO software. In the case study, more specific knowledge models of techniques and their corresponding applications were discovered, indicating that the proposed method is valid and can be used for knowledge discovery.
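To make the first step concrete, the following is a minimal Python sketch of how a co-occurrence matrix between two distinct semantic types might be built. It is an illustration only, not the authors' R implementation: the function name `cooccurrence_matrix`, the semantic-type labels `"disease"` and `"chemical"`, and the three toy records are all assumptions introduced here. Rows hold terms of one semantic type, columns hold terms of the other, and each cell counts the records in which the pair appears together.

```python
from collections import Counter
from itertools import product


def cooccurrence_matrix(records, row_type, col_type):
    """Count co-occurrences between terms of two semantic types.

    `records` is a list of articles; each article is a list of
    (term, semantic_type) pairs. The type labels are hypothetical.
    """
    counts = Counter()
    for record in records:
        # Use sets so a term repeated within one record is counted once.
        rows = {term for term, stype in record if stype == row_type}
        cols = {term for term, stype in record if stype == col_type}
        counts.update(product(rows, cols))
    row_terms = sorted({r for r, _ in counts})
    col_terms = sorted({c for _, c in counts})
    matrix = [[counts[(r, c)] for c in col_terms] for r in row_terms]
    return row_terms, col_terms, matrix


# Toy data, invented for illustration only.
records = [
    [("Breast Neoplasms", "disease"), ("Tamoxifen", "chemical")],
    [("Breast Neoplasms", "disease"), ("Tamoxifen", "chemical"),
     ("Estrogens", "chemical")],
    [("Osteoporosis", "disease"), ("Estrogens", "chemical")],
]
rows, cols, matrix = cooccurrence_matrix(records, "disease", "chemical")
```

A rectangular matrix of this kind, with one semantic type on each axis, is the sort of input a biclustering tool would then group by rows and columns at the same time.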
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Li Fang and Xiaobei Zhou. The first draft of the manuscript was written by Li Fang, and all authors commented on the previous versions of the manuscript. All authors read and approved the final manuscript. We thank International Science Editing (http://www.internationalscienceediting.com) for editing this manuscript.
About this article
Cite this article
Fang, L., Zhou, X. & Cui, L. Biclustering high-frequency MeSH terms based on the co-occurrence of distinct semantic types in a MeSH tree. Scientometrics 124, 1179–1190 (2020). https://doi.org/10.1007/s11192-020-03496-4
DOI: https://doi.org/10.1007/s11192-020-03496-4