Abstract
We present a novel RDBMS-based framework for interactively querying and exploring large-scale bio-science research data. We focus on the interactive exploration model and its evaluation support using Bloom filter indexing techniques for Boolean containment expressions. In particular, our framework helps explore structured research data augmented with schema-less contextual information. Our experiments show significant improvements over traditional indexing techniques, enabling scientists to move from batch-oriented to interactive exploration of research data.
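The abstract does not describe how the Bloom filter index is built, so the following is only a generic sketch of the underlying idea, not the authors' implementation. It assumes each record's contextual annotations are hashed into one small Bloom filter, and that a Boolean containment expression of the form "all of these terms AND any of those terms" is evaluated against the filters to prune records before exact checking. The class `BloomFilter`, the function `candidate_ids`, the filter sizes, and the sample records are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item):
        # Derive k bit positions by salting a SHA-256 digest with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def candidate_ids(filters, required=(), any_of=()):
    """Ids whose filter may satisfy: all of `required` AND (any of `any_of`)."""
    hits = []
    for record_id, bf in filters.items():
        if not all(bf.might_contain(term) for term in required):
            continue
        if any_of and not any(bf.might_contain(term) for term in any_of):
            continue
        hits.append(record_id)
    return hits


# Hypothetical records with schema-less contextual annotations.
records = {
    "sample-1": {"organism:arabidopsis", "assay:proteomics", "tissue:leaf"},
    "sample-2": {"organism:chlamydomonas", "assay:metabolomics"},
}
filters = {}
for record_id, terms in records.items():
    bf = BloomFilter()
    for term in terms:
        bf.add(term)
    filters[record_id] = bf

candidates = candidate_ids(
    filters,
    required=["assay:proteomics"],
    any_of=["tissue:leaf", "tissue:root"],
)
```

Because a Bloom filter can report false positives but never false negatives, `candidates` is a superset of the true matches; a real system would still verify each candidate against the stored data.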
Notes
1. omics: an informal reference to a field of study in biology ending in -omics, such as genomics, proteomics, or metabolomics.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Doniparthi, G., Mühlhaus, T., Deßloch, S. (2020). A Bloom Filter-Based Framework for Interactive Exploration of Large Scale Research Data. In: Darmont, J., Novikov, B., Wrembel, R. (eds) New Trends in Databases and Information Systems. ADBIS 2020. Communications in Computer and Information Science, vol 1259. Springer, Cham. https://doi.org/10.1007/978-3-030-54623-6_15
DOI: https://doi.org/10.1007/978-3-030-54623-6_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-54622-9
Online ISBN: 978-3-030-54623-6
eBook Packages: Computer Science, Computer Science (R0)