Text Mining on PubMed

Ivanisenko, Timofey V.; Demenkov, Pavel S.; Ivanisenko, Vladimir A.

doi:10.1007/978-3-642-41281-3_6

Abstract

Text mining is the technology of analyzing natural-language text with computational methods.

Computer tools based on this technology can perform a wide range of tasks, including:

1. Finding literature relevant to user-specified criteria, and determining how well a single article or a manually selected set of articles corresponds to a research area or to a set of predefined areas.
2. Identifying and extracting the names of biological objects that occur in raw text (e.g., genes, proteins, metabolites), together with additional information about them, such as the object type and its synonyms.
3. Establishing relationships between objects that have been automatically recognized in the text, and representing the resulting data in a form convenient for further analysis, for example as associative networks.
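The three tasks can be illustrated with a minimal Python sketch. It is not the method described in the chapter: the dictionary entries, the sample abstracts, and all function names are hypothetical, relevance is approximated by simple query-term overlap, and relationships are approximated by counting co-occurrence within an abstract.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy dictionary: canonical name -> (object type, synonyms).
DICTIONARY = {
    "TP53": ("protein", ["p53", "tumor protein 53"]),
    "MDM2": ("protein", ["mdm2", "hdm2"]),
    "glucose": ("metabolite", ["dextrose"]),
}

def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def relevance(text, query_terms):
    """Task 1 (assumed scoring): fraction of query terms present in the text."""
    tokens = set(tokenize(text))
    terms = [t.lower() for t in query_terms]
    return sum(t in tokens for t in terms) / len(terms) if terms else 0.0

def find_entities(text):
    """Task 2: canonical names of dictionary objects mentioned in the text."""
    lowered = text.lower()
    found = set()
    for name, (_kind, synonyms) in DICTIONARY.items():
        for variant in [name] + synonyms:
            # Whole-word match on the name or any of its synonyms.
            if re.search(r"\b" + re.escape(variant.lower()) + r"\b", lowered):
                found.add(name)
                break
    return found

def association_network(abstracts):
    """Task 3: edges weighted by the number of abstracts mentioning both objects."""
    edges = Counter()
    for text in abstracts:
        for a, b in combinations(sorted(find_entities(text)), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical sample abstracts, invented for illustration only.
ABSTRACTS = [
    "p53 is negatively regulated by MDM2.",
    "HDM2 binds the tumor protein 53 transactivation domain.",
    "Dextrose uptake was unchanged in p53-null cells.",
]

network = association_network(ABSTRACTS)
```

Here `network` maps pairs of canonical names to co-occurrence counts, which is one simple way to hold an associative network before visualization. Real systems typically replace the toy dictionary with curated resources and the co-occurrence rule with more selective relation extraction.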

Author information

Authors and Affiliations

Institute of Cytology and Genetics SB RAS, Laboratory of the Computer Proteomics, Novosibirsk, Russia
Timofey V. Ivanisenko, Pavel S. Demenkov & Vladimir A. Ivanisenko

Corresponding author

Correspondence to Timofey V. Ivanisenko.

Editor information

Editors and Affiliations

College of Life Sciences, Zhejiang University, Hangzhou, People's Republic of China
Ming Chen
Department of Bioinformatics and Medical Informatics, Bielefeld University, Bielefeld, Germany
Ralf Hofestädt

About this chapter

Cite this chapter

Ivanisenko, T.V., Demenkov, P.S., Ivanisenko, V.A. (2014). Text Mining on PubMed. In: Chen, M., Hofestädt, R. (eds) Approaches in Integrative Bioinformatics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41281-3_6

DOI: https://doi.org/10.1007/978-3-642-41281-3_6
Published: 23 October 2013
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41280-6
Online ISBN: 978-3-642-41281-3
eBook Packages: Computer Science, Computer Science (R0)
