Abstract
In recent years, high-throughput sequencers have been generating enormous volumes of data in hundreds of genome projects around the world. Besides being stored, the original data are transformed through multiple analyses carried out in a computational pipeline. Managing such highly complex data poses significant problems. In this context, a model that represents and organizes these data, and that guarantees their accessibility, correctness and understandability, is essential to support the work of the biologists involved in a transcriptome project. Differing data formats, terminologies, file structures and ontologies make data management very difficult. In this work, we propose a conceptual model for the different phases of a transcriptome high-throughput sequencing pipeline in order to represent and manage data.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Huacarpuma, R.C., Holanda, M., Walter, M.E. (2011). A Conceptual Model for Transcriptome High-Throughput Sequencing Pipeline. In: Norberto de Souza, O., Telles, G.P., Palakal, M. (eds) Advances in Bioinformatics and Computational Biology. BSB 2011. Lecture Notes in Computer Science, vol 6832. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22825-4_10
DOI: https://doi.org/10.1007/978-3-642-22825-4_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22824-7
Online ISBN: 978-3-642-22825-4
eBook Packages: Computer Science (R0)