Research status and trend analysis of global biomedical text mining studies in recent 10 years

Zhai, Xing; Li, Zhihong; Gao, Kuo; Huang, Youliang; Lin, Lin; Wang, Le

doi:10.1007/s11192-015-1700-9

Published: 28 August 2015

Volume 105, pages 509–523, (2015)

Scientometrics

Xing Zhai¹, Zhihong Li², Kuo Gao¹, Youliang Huang¹, Lin Lin³ & Le Wang⁴

Abstract

Objective

In recent years, the rapid growth of the biomedical literature has left many implicit patterns and much new knowledge buried in a vast body of text. Text mining technology, applied to the biomedical field, can integrate and analyze massive biomedical literature data and yield valuable information that improves understanding of biomedical phenomena. This paper examines the research status of text mining technology applied in the biomedical field over the past 10 years, in order to provide a reference for further studies by other researchers.

Methods

Biomedical text mining literature indexed in SCI from 2004 to 2013 was retrieved and filtered, then analyzed from the perspectives of annual changes, regional distribution, research institutions, journal sources, research fields, keywords, and other aspects.
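The kind of tallying described in the Methods can be illustrated with a minimal sketch. It assumes each retrieved record has already been parsed into a dictionary; the field names (`year`, `country`, `keywords`), the `tally` helper, and the three sample records are hypothetical placeholders and are not data or code from the study.

```python
from collections import Counter

# Hypothetical placeholder records; these are NOT data from the study.
records = [
    {"year": 2004, "country": "USA", "keywords": ["named entity recognition", "text mining"]},
    {"year": 2004, "country": "UK", "keywords": ["text mining"]},
    {"year": 2013, "country": "USA", "keywords": ["text clustering", "text mining"]},
]


def tally(records, field):
    """Count how many records carry each value of `field`.

    List-valued fields (such as keywords) contribute each distinct
    item at most once per record.
    """
    counts = Counter()
    for record in records:
        value = record[field]
        items = set(value) if isinstance(value, list) else {value}
        counts.update(items)
    return counts


by_year = tally(records, "year")        # annual changes
by_country = tally(records, "country")  # regional distribution
by_keyword = tally(records, "keywords") # keyword frequency
```

The same helper could be pointed at other fields, such as institution or journal, if the parsed records carried them.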

Results

The total volume of global biomedical text mining literature is rising. Literature on named entity recognition, entity relation extraction, text categorization, text clustering, abbreviation extraction, and co-occurrence analysis accounts for a large share, and studies from the USA and the UK hold the leading position.

Conclusion

Compared with more mature research topics, the application of text mining technology in biomedicine is still a relatively new research field worldwide. With growing awareness of the field and deepening research, a number of core research areas, core research institutes, and core research fields have formed. Further research in this field will therefore inject new vitality into the development of biomedicine.

Acknowledgments

This research is supported by the Young Talent Project of Beijing (No. YETP0821) and the Research Project for Practice Development of National TCM Clinical Research Bases.

Author information

Authors and Affiliations

Beijing University of Chinese Medicine, Beijing, 100029, China
Xing Zhai, Kuo Gao & Youliang Huang
Dongzhimen Hospital, Beijing University of Chinese Medicine, Beijing, 100700, China
Zhihong Li
Knowledge and Action College, HuBei University, Wuhan, 430011, China
Lin Lin
Dongfang Hospital, Beijing University of Chinese Medicine, Beijing, 100078, China
Le Wang

Corresponding author

Correspondence to Le Wang.

Additional information

Xing Zhai, Zhihong Li and Kuo Gao have contributed equally to this work.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 27 kb)

About this article

Cite this article

Zhai, X., Li, Z., Gao, K. et al. Research status and trend analysis of global biomedical text mining studies in recent 10 years. Scientometrics 105, 509–523 (2015). https://doi.org/10.1007/s11192-015-1700-9

Received: 21 July 2015
Published: 28 August 2015
Issue Date: October 2015
DOI: https://doi.org/10.1007/s11192-015-1700-9
