Weighting Scheme Methods for Enhanced Genomic Annotation Prediction

  • Conference paper
Computational Intelligence Methods for Bioinformatics and Biostatistics (CIBB 2013)

Abstract

Functional genomic annotation data banks, which store the associations between genes (or gene products) and terms of controlled vocabularies describing their features, are paramount in computational biology. Despite their undeniable importance, these data sources can be considered neither complete nor totally accurate: their curated updates often add new annotations and revise existing ones. In this scenario, computational methods that can speed up the curation of such data banks are very important. To this end, Latent Semantic Indexing (LSI) by Singular Value Decomposition and its Semantically IMproved (SIM) variant have been shown to predict novel functional annotations from a set of available ones. In this work, we propose a further improvement of those techniques, based on a preparatory weighting of the associations between genes (or gene products) and functional annotation terms. We tested the effectiveness of our approach on nine Gene Ontology annotation datasets. The results showed that this technique improves the prediction of novel annotations.
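As a rough illustration of the idea summarized in the abstract, the following Python sketch weights a binary gene-by-term annotation matrix, computes a rank-k approximation by Singular Value Decomposition, and flags currently absent annotations whose reconstructed score exceeds a threshold. The abstract does not state which weighting schemes were evaluated, so the IDF-style term weights used here are an assumption made only for illustration, as are the function name `predict_annotations`, its parameters, the step that divides the weights back out before thresholding, and the toy matrix. The SIM variant is not reproduced.

```python
import numpy as np


def predict_annotations(A, k, threshold, weighted=True):
    """Suggest new gene-term annotations via a rank-k SVD reconstruction.

    A is a binary matrix with genes as rows and annotation terms as columns.
    Returns a boolean matrix that is True where an annotation is absent in A
    but its reconstructed score is at least `threshold`.
    """
    A = np.asarray(A, dtype=float)
    n_genes = A.shape[0]

    if weighted:
        # Assumed IDF-style weights (not taken from the paper): terms
        # annotated to fewer genes receive a larger weight. Always >= 1.
        term_freq = A.sum(axis=0)
        weights = np.log((1.0 + n_genes) / (1.0 + term_freq)) + 1.0
    else:
        weights = np.ones(A.shape[1])

    W = A * weights  # preparatory weighting of the associations

    # Truncated SVD: keep only the k largest singular values.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    W_k = (U[:, :k] * s[:k]) @ Vt[:k, :]

    # Assumption: divide the weights back out so that scores are comparable
    # to the original 0/1 scale before thresholding.
    scores = W_k / weights

    return (scores >= threshold) & (A == 0)


# Toy example: 4 genes, 3 terms (purely illustrative data).
A = np.array([
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 1],
    [0, 1, 1],
])
candidates = predict_annotations(A, k=1, threshold=0.5)
```

Each True entry in `candidates` marks a gene-term pair that is not annotated in `A` but is suggested by the low-rank model. The truncation level `k` and the threshold both control how many candidates are proposed, and would need tuning on real data.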

Acknowledgments

This research is part of the Search Computing project (2008–2013) funded by the European Research Council (ERC), IDEAS Advanced Grant. The authors would like to thank Luke Lloyd-Jones for his help in revising the English style of the text.

Author information

Correspondence to Pietro Pinoli.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Pinoli, P., Chicco, D., Masseroli, M. (2014). Weighting Scheme Methods for Enhanced Genomic Annotation Prediction. In: Formenti, E., Tagliaferri, R., Wit, E. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2013. Lecture Notes in Computer Science, vol 8452. Springer, Cham. https://doi.org/10.1007/978-3-319-09042-9_6

  • DOI: https://doi.org/10.1007/978-3-319-09042-9_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09041-2

  • Online ISBN: 978-3-319-09042-9

  • eBook Packages: Computer Science (R0)
