Abstract
Literature research requires an understanding of the similarities and differences between types of journals. Text mining has not yet been used to demonstrate differences between article topics by presenting features of article keywords in forest plots. Authors benefit from a quick assessment of the similarities and differences between research types when submitting an article to a journal. Our study uses text mining and forest plots to extract article features and compare the research types of two journals. A total of 100 top-cited articles were selected from each of Spine (Phila Pa 1976) and The Spine Journal: official journal of the North American Spine Society, with impact factors of 3.19 and 3.22, respectively, as reported by Journal Citation Reports (JCR) for 2018. XLSTAT software was used to extract features from author-made keywords and medical subject headings (MeSH terms in PubMed). These 200 top-cited articles were analyzed and clustered by performing factor analysis and social network analysis (SNA). The study presented three types of results: (1) descriptive statistics, (2) classification analysis, and (3) inferential statistics. The chi-square test was used to compare cluster frequencies between journals, and forest plots were used to analyze differences between journals in terms of research topics.
It was observed that (1) the United States dominated publications, accounting for 54% of the 200 articles, and the MeSH term surgery was highlighted in both journals' word clouds; (2) five term clusters were identified, namely, (i) Pain & Prognosis, (ii) Statistics & Data, (iii) Spine & Surgery, (iv) physiopathology, and (v) physiology; (3) there were no differences in distribution counts among categories between journals (chi-square = 1.64, df = 4, p = 0.82), but differences in category (factor) scores between journals were found (Q-statistic = 484.94, df = 4, p < 0.001). Text mining combined with a forest plot makes it possible to understand the relationships between the types of research in different journals. Readers can use the study results as a reference for future journal submissions.
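The two inferential tests named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure (the study used XLSTAT), and it assumes that the reported Q-statistic is Cochran's Q with inverse-variance weights. All counts, effect sizes, and standard errors below are hypothetical placeholders, not values from the study, and the function names are introduced here only for illustration.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and df for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form holds only for even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def cochran_q(effects, std_errors):
    """Cochran's Q with inverse-variance weights; returns (Q, df, pooled effect)."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    return q, len(effects) - 1, pooled


# Hypothetical article counts per cluster (5 clusters) for two journals.
counts = [
    [30, 15, 25, 20, 10],  # journal A (placeholder values)
    [26, 18, 28, 16, 12],  # journal B (placeholder values)
]
stat, df = chi_square_independence(counts)
print(f"chi-square = {stat:.2f}, df = {df}, p = {chi2_sf_even_df(stat, df):.3f}")

# Hypothetical standardized mean differences between journals, one per cluster.
smd = [0.40, -0.25, 0.10, 0.55, -0.30]
se = [0.10, 0.12, 0.09, 0.15, 0.11]
q, q_df, pooled = cochran_q(smd, se)
print(f"Q = {q:.2f}, df = {q_df}, pooled SMD = {pooled:.3f}")
```

With two journals and five clusters, both tests have df = 4, which matches the degrees of freedom reported above. The closed-form tail probability is used only to keep the sketch free of external dependencies; a statistics library would normally supply it.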
Data availability
All data used in this study are available in the Online Appendices.
Abbreviations
- ASD: Adult spinal deformity
- EFA: Exploratory factor analysis
- JCR: Journal Citation Reports
- MeSH: Medical subject headings
- RT: Research topics
- SNA: Social network analysis
- SMD: Standardized mean difference
Acknowledgements
We thank Enago (www.enago.tw) for the English language review of this manuscript.
Author information
Contributions
PHC initiated the research and wrote the manuscript. JCJL collected data and conducted the analysis. TWC contributed to the study’s design, provided critical reviews of the manuscript, and contributed to the interpretation of the results.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
All data were downloaded from the MEDLINE database at pubmed.com.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Chou, PH., Lin, JC.J. & Chien, TW. Using text mining and forest plots to identify similarities and differences between two spine-related journals based on medical subject headings (MeSH terms) and author-specified keywords in 100 top-cited articles. Scientometrics 128, 1–17 (2023). https://doi.org/10.1007/s11192-022-04549-6
DOI: https://doi.org/10.1007/s11192-022-04549-6