Abstract
Advances in high-throughput technologies have considerably increased the amount of research data generated by bio-science experiments. Integrated analysis of these large datasets offers opportunities to better understand complex biological systems. We present a novel research data management framework that uses a hybrid relational and NoSQL data model for interactively querying and exploring large-scale bio-science research data. The framework uses a fast, scalable, space-efficient, and flexible bitmap-based indexing scheme that is purpose-built for exploratory data analysis and supports containment, point, and range queries.
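To make the three query types concrete, the following is a minimal sketch of a bitmap index. It is not the authors' implementation: the class `BitmapIndex`, the helper `rows`, the sample attributes (`ph`, `tags`), and the reading of "containment" as "a set-valued attribute contains all given values" are assumptions made for illustration. Each distinct value gets one bitmap (a Python integer) whose bit i is set when row i carries that value.

```python
from collections import defaultdict


class BitmapIndex:
    """Toy bitmap index (illustrative assumption, not the paper's scheme)."""

    def __init__(self):
        self.bitmaps = defaultdict(int)  # value -> bitmap of row ids
        self.num_rows = 0

    def add(self, values):
        """Index one row carrying one or more values; return its row id."""
        row_id = self.num_rows
        for value in values:
            self.bitmaps[value] |= 1 << row_id
        self.num_rows += 1
        return row_id

    def point(self, value):
        """Rows whose attribute equals (or includes) `value`."""
        return self.bitmaps.get(value, 0)

    def range_query(self, low, high):
        """Rows with at least one value in the closed interval [low, high]."""
        result = 0
        for value, bitmap in self.bitmaps.items():
            if low <= value <= high:
                result |= bitmap
        return result

    def contains_all(self, values):
        """Rows whose value set contains every element of `values`."""
        result = (1 << self.num_rows) - 1  # start with all rows set
        for value in values:
            result &= self.bitmaps.get(value, 0)
        return result


def rows(bitmap):
    """Decode a bitmap into a sorted list of row ids."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical sample data: one numeric and one set-valued attribute per row.
samples = [
    (6.5, {"genomics", "leaf"}),
    (7.0, {"proteomics", "leaf"}),
    (7.4, {"genomics", "root"}),
    (8.1, {"genomics", "leaf", "stress"}),
]
ph, tags = BitmapIndex(), BitmapIndex()
for ph_value, tag_set in samples:
    ph.add([ph_value])
    tags.add(tag_set)

point_hits = rows(ph.point(7.0))
range_hits = rows(ph.range_query(6.8, 7.5))
containment_hits = rows(tags.contains_all({"genomics", "leaf"}))
# Predicates on different attributes combine with a single bitwise AND.
combined_hits = rows(ph.range_query(6.0, 7.5) & tags.contains_all({"genomics"}))
```

In this sketch, combining predicates across attributes costs one bitwise operation per predicate, which is one reason bitmap indexes are often a good fit for interactive, exploratory querying. A production scheme would typically compress the bitmaps and bin numeric values rather than scan every distinct value for a range query.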
Notes
- omics: an informal reference to a field of study in biology whose name ends in -omics, such as genomics, proteomics, or metabolomics.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Doniparthi, G., Mühlhaus, T., Deßloch, S. (2021). A Hybrid Data Model and Flexible Indexing for Interactive Exploration of Large-Scale Bio-science Data. In: Bellatreche, L., et al. New Trends in Database and Information Systems. ADBIS 2021. Communications in Computer and Information Science, vol 1450. Springer, Cham. https://doi.org/10.1007/978-3-030-85082-1_3
DOI: https://doi.org/10.1007/978-3-030-85082-1_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85081-4
Online ISBN: 978-3-030-85082-1
eBook Packages: Computer Science, Computer Science (R0)