Mining Cell Cycle Literature Using Support Vector Machines

  • Conference paper
Artificial Intelligence: Theories and Applications (SETN 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7297)

Abstract

While the biomedical literature is growing rapidly, text classification remains a challenge for researchers, curators and librarians. In this work, we use the Caipirini (http://caipirini.org) service to explore a literature corpus related to the G1, S, G2 and M phases of the human cell cycle. We use Support Vector Machines (SVMs) and a well-studied dataset to compare each cell cycle phase against all the others, in order to find abstracts related to one specific phase at a time. Finally, we measure classification performance using the standard accuracy, precision and recall metrics. We find differences among the results for the four phases and compare them with previous findings from related work. We conclude that the results concur with those findings and help to interpret the observed classification performance.
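
The one-phase-against-all-others evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the Caipirini implementation, and it does not train an SVM: it only shows how accuracy, precision and recall could be computed per phase once a classifier has labeled each abstract. The function name `one_vs_rest_scores` and the toy label lists are hypothetical and exist only for illustration; they are not data from the paper.

```python
from collections import namedtuple

# The four cell cycle phases named in the abstract.
PHASES = ("G1", "S", "G2", "M")

Scores = namedtuple("Scores", "accuracy precision recall")


def one_vs_rest_scores(true_labels, predicted_labels, phase):
    """Score one phase against all others, with `phase` as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == phase and p == phase)
    fp = sum(1 for t, p in pairs if t != phase and p == phase)
    fn = sum(1 for t, p in pairs if t == phase and p != phase)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    # Guard against empty denominators when a phase is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return Scores(accuracy, precision, recall)


# Hypothetical toy labels (assumption): one true and one predicted phase per abstract.
true_labels = ["G1", "S", "G2", "M", "G1", "S", "M", "M"]
predicted_labels = ["G1", "S", "G1", "M", "S", "S", "M", "G2"]

results = {
    phase: one_vs_rest_scores(true_labels, predicted_labels, phase)
    for phase in PHASES
}

for phase, scores in results.items():
    print(phase, scores)
```

Treating each phase in turn as the positive class gives four separate score triples, which is what makes per-phase differences of the kind the abstract reports visible.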

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Soldatos, T.G., Pavlopoulos, G.A. (2012). Mining Cell Cycle Literature Using Support Vector Machines. In: Maglogiannis, I., Plagianakos, V., Vlahavas, I. (eds) Artificial Intelligence: Theories and Applications. SETN 2012. Lecture Notes in Computer Science, vol 7297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30448-4_35

  • DOI: https://doi.org/10.1007/978-3-642-30448-4_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30447-7

  • Online ISBN: 978-3-642-30448-4

  • eBook Packages: Computer Science, Computer Science (R0)
