A Hybrid Data Model and Flexible Indexing for Interactive Exploration of Large-Scale Bio-science Data

  • Conference paper
New Trends in Database and Information Systems (ADBIS 2021)

Abstract

The advancement of high-throughput technologies has considerably increased the amount of research data generated from bio-science experiments. The integrated analysis of these large datasets provides opportunities to understand complex biological systems better. We present a novel research data management framework that uses a hybrid relational and NoSQL data model for interactively querying and exploring large-scale bio-science research data. Our framework uses a fast, scalable, space-efficient, and flexible indexing scheme leveraging bitmaps purpose-built for exploratory data analysis and supports containment, point, and range query types.
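The three query types named in the abstract can be illustrated with a toy bitmap index. The sketch below is not the authors' indexing scheme: the class `BitmapIndex`, its method names, and the sample measurements and tags are all hypothetical, and each bitmap is simply a Python integer whose bit i is set when row i matches.

```python
from collections import defaultdict


class BitmapIndex:
    """Toy bitmap index (illustrative only): one bitmap per distinct value."""

    def __init__(self):
        # value -> integer bitmap; bit i is set when row i carries the value
        self.bitmaps = defaultdict(int)

    def add(self, row_id, value):
        self.bitmaps[value] |= 1 << row_id

    def point_query(self, value):
        """Rows whose indexed value equals `value`."""
        return self.bitmaps.get(value, 0)

    def range_query(self, low, high):
        """Rows with a value in [low, high]: OR of the qualifying bitmaps."""
        result = 0
        for value, bitmap in self.bitmaps.items():
            if low <= value <= high:
                result |= bitmap
        return result

    def containment_query(self, values):
        """Rows whose indexed values include every item in `values`: AND."""
        result = None
        for value in values:
            bitmap = self.bitmaps.get(value, 0)
            result = bitmap if result is None else result & bitmap
        return result or 0


def rows(bitmap):
    """Decode an integer bitmap into a sorted list of row ids."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Hypothetical sample data: one numeric measurement per row ...
intensity = BitmapIndex()
for row_id, value in enumerate([5, 12, 5, 20]):
    intensity.add(row_id, value)

# ... and a set of annotation tags per row.
tags = BitmapIndex()
for row_id, row_tags in enumerate(
    [{"leaf", "heat"}, {"root"}, {"leaf", "heat", "light"}, {"leaf"}]
):
    for tag in row_tags:
        tags.add(row_id, tag)

point_rows = rows(intensity.point_query(5))
range_rows = rows(intensity.range_query(10, 20))
containment_rows = rows(tags.containment_query(["leaf", "heat"]))
```

Because every query result is itself a bitmap, conditions on different attributes can be combined with a single bitwise AND or OR before any rows are fetched, which is one reason bitmap indexes are commonly used for exploratory, ad hoc filtering.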

Notes

  1. https://nfdi4plants.de/

  2. omics - an informal reference to a field of study in biology ending in -omics, such as Genomics, Proteomics, Metabolomics etc.

Author information

Corresponding author

Correspondence to Gajendra Doniparthi.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Doniparthi, G., Mühlhaus, T., Deßloch, S. (2021). A Hybrid Data Model and Flexible Indexing for Interactive Exploration of Large-Scale Bio-science Data. In: Bellatreche, L., et al. New Trends in Database and Information Systems. ADBIS 2021. Communications in Computer and Information Science, vol 1450. Springer, Cham. https://doi.org/10.1007/978-3-030-85082-1_3

  • DOI: https://doi.org/10.1007/978-3-030-85082-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-85081-4

  • Online ISBN: 978-3-030-85082-1

  • eBook Packages: Computer Science, Computer Science (R0)
