Design and Implementation of ProteinWorldDB

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 7409)

Abstract

This work compares protein information on a genomic scale. The main goal is to improve the quality and interpretation of biological data, as well as our understanding of biological systems and their interactions. Stringent comparisons were obtained by applying the Smith-Waterman algorithm pairwise to all predicted proteins encoded in both completely sequenced and unfinished genomes available in the public database RefSeq. The comparisons were run on a computational grid, and the complete result exceeds 900 GB. Consequently, database design is a critical step in storing and managing the comparison results. This paper describes conceptual design issues in creating a database that represents a data set of protein sequence cross-comparisons. We show that our conceptual schema and its relational mapping enable users to extract relevant information, from simple queries to complex ones that integrate distinct data sources.
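As a rough illustration of the pairwise Smith-Waterman comparison mentioned in the abstract, the following is a minimal sketch in Python. The function names, the match/mismatch scores, and the linear gap penalty are illustrative assumptions; they are not the parameters or the implementation used to build ProteinWorldDB, which would more plausibly rely on a substitution matrix and an optimized tool.

```python
from itertools import combinations


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring values are placeholder assumptions (simple match/mismatch
    and a linear gap penalty), chosen only to keep the sketch short.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def pairwise_scores(proteins, **scoring):
    """Score every unordered pair in a dict of {identifier: sequence}."""
    return {
        (p, q): smith_waterman_score(proteins[p], proteins[q], **scoring)
        for p, q in combinations(sorted(proteins), 2)
    }


if __name__ == "__main__":
    # Hypothetical toy sequences, not real RefSeq entries.
    toy = {"p1": "ACGT", "p2": "ACGT", "p3": "TTTT"}
    for pair, score in pairwise_scores(toy).items():
        print(pair, score)
```

The number of pairs grows quadratically with the number of proteins, which is consistent with the abstract's remarks that the comparisons were run on a computational grid and that the result volume makes database design a critical step.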

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lifschitz, S. et al. (2012). Design and Implementation of ProteinWorldDB. In: de Souto, M.C., Kann, M.G. (eds) Advances in Bioinformatics and Computational Biology. BSB 2012. Lecture Notes in Computer Science, vol 7409. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31927-3_13

  • DOI: https://doi.org/10.1007/978-3-642-31927-3_13

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-31926-6

  • Online ISBN: 978-3-642-31927-3

  • eBook Packages: Computer Science, Computer Science (R0)
