chapter

The BigDAWG polystore system

Published: 01 December 2018

Published in

ACM Books
Making Databases Work: the Pragmatic Wisdom of Michael Stonebraker
December 2018
725 pages
ISBN: 9781947487192
DOI: 10.1145/3226595

Publisher: Association for Computing Machinery and Morgan & Claypool
