Abstract
Large-scale scientific data sets are often analyzed to support workflows and querying. Users need to query across different data sources, and the systems that serve them must manage intermediate results. Most prototypes are complex and have an ad hoc design, so they require extensive modifications when the data grows or the scale changes, in terms of either data volume or number of users. New data sources may arise and further complicate the ad hoc design. The polystore data management approach provides ‘data independence’ against changes in data profile, including the addition of cloud data resources. Users are often provided with a quasi-relational query language. In many cases, polystore systems also support distinct tasks in the form of user-defined workflow activities, in addition to providing a common view of data resources.
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Patidar, R.G., Shrestha, S., Bhalla, S. (2018). Polystore Data Management Systems for Managing Scientific Data-sets in Big Data Archives. In: Mondal, A., Gupta, H., Srivastava, J., Reddy, P., Somayajulu, D. (eds) Big Data Analytics. BDA 2018. Lecture Notes in Computer Science, vol 11297. Springer, Cham. https://doi.org/10.1007/978-3-030-04780-1_15
DOI: https://doi.org/10.1007/978-3-030-04780-1_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-04779-5
Online ISBN: 978-3-030-04780-1
eBook Packages: Computer Science, Computer Science (R0)