FastDAWG: Improving Data Migration in the BigDAWG Polystore System

  • Conference paper
Heterogeneous Data Management, Polystores, and Analytics for Healthcare (DMAH 2018, Poly 2018)

Abstract

The problem of data integration has been around for decades, yet a satisfactory solution has not yet emerged. A new type of system called a polystore has surfaced to partially address the integration problem. Based on experience with our own polystore called BigDAWG, we identify three major roadblocks to an acceptable commercial solution. We offer a new architecture inspired by these three problems that trades some generality for usability. This architecture also exploits modern hardware (i.e., high-speed networks and RDMA) to gain performance. The paper concludes with some promising experimental results.

Author information

Corresponding author

Correspondence to Xiangyao Yu.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Yu, X., Gadepally, V., Zdonik, S., Kraska, T., Stonebraker, M. (2019). FastDAWG: Improving Data Migration in the BigDAWG Polystore System. In: Gadepally, V., Mattson, T., Stonebraker, M., Wang, F., Luo, G., Teodoro, G. (eds) Heterogeneous Data Management, Polystores, and Analytics for Healthcare. DMAH 2018, Poly 2018. Lecture Notes in Computer Science, vol 11470. Springer, Cham. https://doi.org/10.1007/978-3-030-14177-6_1

  • DOI: https://doi.org/10.1007/978-3-030-14177-6_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-14176-9

  • Online ISBN: 978-3-030-14177-6

  • eBook Packages: Computer Science, Computer Science (R0)
