Indexing important drugs from medical literature

Scientometrics

Abstract

Health maintenance is one of the foremost pillars of human society, and it requires up-to-date solutions to medical problems. Advances in the biomedical field have intensified the information load, which exists in the form of clinical reports, research papers, lab tests, and so on. Extracting meaningful insights from this corpus is as important as the growth of the corpus itself if it is to be of value to modern medicine. In biomedical text mining, the areas explored so far include protein–protein interactions and entity-relationship detection, among others. The biomedical effects of a drug become significant when it is administered to a living organism. Gene-drug relations have not been widely explored in the biomedical literature and therefore need investigation. Indexing methods can be used to rank gene-drug relations. In scientific literature, Hirsch’s h-index is commonly used to quantify the impact of an individual author. Likewise, in this research, we propose the Drug-Index, a quantifiable measure that can be used to detect gene-drug relations. It is useful in drug discovery, diagnosis, and personalized treatment that uses suitable drugs for relevant genes. For strong and reliable gene-drug relationship discovery, drugs are extracted from a subset of MEDLINE, a bibliographic medical database. The detected drugs are verified against the PharmacoGenomics KnowledgeBase (PharmGKB), a publicly available medical knowledge base from Stanford University.
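
As a rough illustration of the h-index idea that the abstract builds on, the sketch below computes Hirsch's h-index (the largest h such that at least h items each have a count of at least h) and then applies the same rule to per-gene co-mention counts for a single drug. The second use is only an assumption made for illustration: the abstract does not define how the Drug-Index is calculated, so the function name `h_index`, the gene labels, and all numbers here are hypothetical and should not be read as the authors' method.

```python
from typing import Iterable


def h_index(counts: Iterable[int]) -> int:
    """Return the largest h such that at least h of the counts are >= h."""
    ranked = sorted(counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` items all have at least `rank` counts
        else:
            break
    return h


# Citation counts for one author's papers (made-up numbers).
author_citations = [10, 8, 5, 4, 3]

# HYPOTHETICAL analogue, not the paper's definition: the number of
# abstracts in which one drug is co-mentioned with each gene.
drug_gene_mentions = {"GENE_A": 6, "GENE_B": 3, "GENE_C": 3, "GENE_D": 1}

author_h = h_index(author_citations)
drug_h = h_index(drug_gene_mentions.values())
print(f"author h-index: {author_h}")
print(f"h-style score for the drug: {drug_h}")
```

Sorting once and scanning by rank keeps the calculation simple; whether co-mention counts are the right input for a drug-level score is a design question that the full article, not this sketch, would have to answer.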

Notes

  1. https://www.nlm.nih.gov/pubs/factsheets/medline.html

  2. https://pubmed.ncbi.nlm.nih.gov

  3. https://www.nlm.nih.gov/medline/medline_overview.html

  4. http://www.nlm.nih.gov/research/umls/

  5. https://www.pharmgkb.org/

  6. http://www.drugbank.ca/

  7. http://www.accessdata.fda.gov/scripts/cder/ndc/

  8. http://www.nlm.nih.gov/research/umls/knowledge_sources/metathesaurus/index.html

  9. http://www.proteininformationresource.org/iprolink/biothesaurus/

  10. https://en.wikipedia.org/wiki/Adenosine_triphosphate

  11. https://en.wikipedia.org/wiki/Adenosine_triphosphate

References

  • Alasbahi, R. H., & Melzig, M. F. (2012). Forskolin and derivatives as tools for studying the role of cAMP. Die Pharmazie-An International Journal of Pharmaceutical Sciences, 67(1), 5–13.

  • An-Bing, Z., Hui-Hua, Y., Xipeng, P., Li-Hui, Y., & Yan-chun, F. (2020). On-site identification of counterfeit drugs based on near-infrared spectroscopy Siamese-network modeling. IEEE Access. https://doi.org/10.1109/ACCESS.2020.3047683

  • Aronson, A. R. (2001). Effective mapping of Biomedical Text to the UMLS Metathesaurus: The Metamap Program. In Proceedings of the AMIA symposium, (pp. 17–21).

  • Aronson, A. R., & Lang, F.-M. (2010). An overview of MetaMap: historical perspective and recent advances. JAMIA: A Scholarly Journal of Informatics in Health and Biomedicine, 17, 229–236.

  • Bahat, H. S., Takasaki, H., Chen, X., Bet-Or, Y., & Treleaven, J. (2015). Cervical kinematic training with and without interactive VR training for chronic neck pain–a randomized clinical trial. Manual therapy, 20(1), 68–78.

  • Baumgartner, W. A., Jr., Cohen, K. B., Fox, L. M., Acquaah-Mensah, G., & Hunter, L. (2007). Manual curation is not sufficient for annotation of genomic databases. Bioinformatics, 23, 41–48.

  • Blakey, J. D., & Hall, I. P. (2011). Current progress in pharmacogenomics. British Journal of Clinical Pharmacology, 71, 824–836.

  • Bodenreider, O. (2004). The Unified Medical Language System (UMLS): Integrating biomedical terminology. Nucleic Acids Research, 32, D267–D270.

  • Braat, H., Rottiers, P., Hommes, D. W., Huyghebaert, N., Remaut, E., Remon, J. P., et al. (2006). A phase I trial with transgenic bacteria expressing interleukin-10 in Crohn’s disease. Clinical gastroenterology and hepatology, 4(6), 754–759.

  • Chen, Q., & Pan, G. (2021). A structure-self-organizing DBN for image recognition. Neural Computing and Applications, 33, 877–886. https://doi.org/10.1007/s00521-020-05262-2

  • Choi, S. Y., Lee, H., & Yoo, Y. (2010). The impact of information technology and transactive memory systems on knowledge sharing, application, and team performance: A field study. MIS quarterly, 855–870.

  • Cohen, K. B., Johnson, H. L., Verspoor, K., Roeder, C., & Hunter, L. E. (2010). The structural and content aspects of abstracts versus bodies of full text journal articles are different. BMC Bioinformatics, 11, 492.

  • Ding, J., Berleant, D., Nettleton, D., & Wurtele, E. (2002). Mining MEDLINE: Abstracts, sentences, or phrases? Pacific Symposium on Biocomputing (pp. 326–3).

  • Ding, Y., Tang, J., & Guo, F. (2017). Identification of drug-target interactions via multiple information integration. Information Sciences, 418, 546–560.

  • Basiri, M. E., Abdar, M., Cifci, M. A., Nemati, S., & Acharya, U. R. (2020). A novel method for sentiment classification of drug reviews using fusion of deep and machine learning techniques. Knowledge-Based Systems. https://doi.org/10.1016/j.knosys.2020.105949

  • Fabiano, G., Marcellusi, A., & Favato, G. (2020). Public–private contribution to biopharmaceutical discoveries: A bibliometric analysis of biomedical research in UK. Scientometrics, 124, 153–168. https://doi.org/10.1007/s11192-020-03429-1

  • Follett, L., Geletta, S., & Laugerman, M. (2019). Quantifying risk associated with clinical trial termination: A text mining approach. Information Processing and Management, 56(3), 516–525. https://doi.org/10.1016/j.ipm.2018.11.009

  • Fraunhofer SCAI: Corpora for Chemical Entity Recognition. (2016). Retrieved 12 27, 2014 from Fraunhofer SCAI: http://www.scai.fraunhofer.de/en/business-research-areas/bioinformatics/research-development/information-extraction-semantic-text-analysis/named-entity-recognition/chem-corpora.html

  • Furman, D. J., III., Naskolnakorn, J., Ye, J., Kayser, A., & D’Esposito, M. (2020). Effects of dopaminergic drugs on cognitive control processes vary by genotype. Journal of Cognitive Neuroscience, 32(5), 804–821.

  • Garten, Y., Coulet, A., & Altman, R. B. (2010). Recent progress in automatically extracting information from the pharmacogenetic literature. Pharmacogenomics, 11, 1467–1489.

  • Geng, Z., Chen, G., Han, Y., Lu, G., & Li, F. (2020). Semantic relation extraction using sequential and tree-structured LSTM with attention. Information Sciences, 509, 183–192.

  • Giacomini, K. M., Krauss, R. M., Roden, D. M., Eichelbaum, M., Hayden, M. R., & Nakamura, Y. (2007). When good drugs go bad. Nature, 446, 975–977.

  • Hamburg, M. A., & Collins, F. S. (2010). The path to personalized medicine. The New England Journal of Medicine, 363, 301–304.

  • Hewett, M., Oliver, D. E., Rubin, D. L., Easton, K. L., Stuart, J. M., Altman, R. B., & Klein, T. E. (2002). PharmGKB: The Pharmacogenetics Knowledge Base. Nucleic Acids Research, 30(1), 163–165.

  • Hirsch, J. E. (2005). An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences, 102(46), 16569–16572.

  • Klinger, R., Kolářik, C., Fluck, J., Hofmann-Apitius, M., & Friedrich, C. M. (2008). Detection of IUPAC and IUPAC-like chemical names. Bioinformatics, 24(13), i268–i276.

  • Knowles, B. B., Howe, C. C., & Aden, D. P. (1980). Human hepatocellular carcinoma cell lines secrete the major plasma proteins and hepatitis B surface antigen. Science, 209(4455), 497–499.

  • Li, X., Peng, S., & Du, J. (2021). Towards medical knowmetrics: Representing and computing medical knowledge using semantic predications as the knowledge unit and the uncertainty as the knowledge context. Scientometrics, 126, 6225–6251. https://doi.org/10.1007/s11192-021-03880-8

  • Liu, H., Hu, Z.-Z., Zhang, J., & Wu, C. (2006). BioThesaurus: A web-based thesaurus of protein and gene names. Bioinformatics, 22(1), 103–105.

  • Lu, Z. (2011). PubMed and beyond: A survey of Web tools for searching biomedical literature. Database: The Journal of Biological Databases and Curation, 2011.

  • McCray, A. T., Srinivasan, S., & Browne, A. C. (1994). Lexical methods for managing variation in biomedical terminologies. In Proceedings of the annual symposium on computer application in medical care (pp. 235–239).

  • Naseem, U., Musial, K., Eklund, P., & Prasad, M. (2020). Biomedical named-entity recognition by hierarchically fusing BioBERT representations and deep contextual-level word-embedding. In International Joint Conference on Neural Networks (IJCNN), (pp. 1–8). Glasgow, UK. https://doi.org/10.1109/IJCNN48605.2020.9206808

  • Nguyen, N., Choi, C. J., Robbins, R., Korich, R., Raymond, J., Dolezal, C., et al. (2020). Psychiatric trajectories across adolescence in perinatally HIV-exposed youth: The role of HIV infection and associations with viral load. AIDS (London, England), 34(8), 1205.

  • Percha, B., & Altman, R. B. (2015). Learning the structure of biomedical relationships from unstructured text. PLOS Computational Biology, 11(7), e1004216.

  • Quirk, C., & Poon, H. (2017). Distant Supervision for Relation Extraction beyond the Sentence Boundary. In Proceedings of the 15th conference of the European chapter of the Association for computational linguistics: Volume 1, Long Papers (pp. 1171–1182).

  • Samuels, Y., Wang, Z., Bardelli, A., Silliman, N., Ptak, J., Szabo, S., et al. (2004). High frequency of mutations of the PIK3CA gene in human cancers. Science, 304(5670), 554–554.

  • Segura-Bedmar, I., Martínez, P., & Segura-Bedmar, M. (2008). Drug name recognition and classification in biomedical texts. A case study outlining approaches underpinning automated systems. Drug Discovery Today, 13, 816–823.

  • Siu, A., Nguyen, D. B., & Weikum, G. (2013). Fast entity recognition in biomedical text. In Workshop on Data Mining for Healthcare (DMH) at the 19th ACM SIGKDD conference on Knowledge Discovery and Data Mining (KDD) 2013. Chicago, USA: Association for Computing Machinery (ACM).

  • Song, M., Kim, M., Kang, K., Kim, Y. H., & Jeon, S. (2018). Application of public knowledge discovery tool (PKDE4J) to represent biomedical scientific knowledge. Frontiers in Research Metrics and Analytics, 3, 7.

  • Takanobu, R., Zhang, T., Liu, J., & Huang, M. (2019). A Hierarchical Framework for Relation Extraction with Reinforcement Learning. Proceedings of the AAAI conference on artificial intelligence.

  • Wang, L., Mo, T., Wang, X., Chen, W., He, Q., Li, X., & Zhen, X. (2021). A hierarchical fusion framework to integrate homogeneous and heterogeneous classifiers for medical decision-making. Knowledge-Based Systems. https://doi.org/10.1016/j.knosys.2020.106517

  • Wang, X., Yang, C., & Guan, R. (2018). A comparative study for biomedical named entity recognition. International Journal of Machine Learning and Cybernetics, 9, 373–382.

  • Wang, X., Zhang, S., Wu, Y., & Yang, X. (2021). Revealing potential drug-disease-gene association patterns for precision medicine. Scientometrics, 126, 3723–3748. https://doi.org/10.1007/s11192-021-03892-4

  • Wu, Y., Liu, M., Zheng, W. J., Zhao, Z., & Xu, H. (2012). Ranking gene-drug relationships in biomedical literature using Latent Dirichlet Allocation. Pacific Symposium on Biocomputing, 2012, 422–433.

  • Xu, R., & Wang, Q. (2012). A knowledge-driven conditional approach to extract pharmacogenomics specific drug-gene relationships from free text. Journal of Biomedical Informatics, 45(5), 827–834.

  • Xu, R., & Wang, Q. (2013). A semi-supervised approach to extract pharmacogenomics-specific drug–gene pairs from biomedical literature for personalized medicine. Journal of Biomedical Informatics, 46(4), 585–593.

  • Yang, H., Hu, B., Pan, X., Yan, S., Feng, Y., Zhang, X., & Hu, C. (2017). Deep belief network-based drug identification using near infrared spectroscopy. Journal of Innovative Optical Health Sciences, 10(2), 1–10.

Acknowledgements

The work was funded by the University of Jeddah, Saudi Arabia, under Grant No. DSR-UJ-20-047-DR. This work was also supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2019S1A5C2A03083499). The authors, therefore, acknowledge with thanks the university's technical and financial support. The main idea of the work was proposed by Ali Daud, who also supervised the work.

Author information

Corresponding author

Correspondence to Ali Daud.

About this article

Cite this article

Alharbey, R., Kim, J.I., Daud, A. et al. Indexing important drugs from medical literature. Scientometrics 127, 2661–2681 (2022). https://doi.org/10.1007/s11192-022-04340-7

  • DOI: https://doi.org/10.1007/s11192-022-04340-7
