
Report from the third workshop on Algorithms and Systems for MapReduce and Beyond (BeyondMR'16)

Published: 01 September 2017

Abstract

This report summarizes the presentations and discussions of the third workshop on Algorithms and Systems for MapReduce and Beyond (BeyondMR'16). The BeyondMR workshop was held in conjunction with the 2016 SIGMOD conference in San Francisco, California, USA on July 1, 2016. The goal of the workshop was to bring together researchers and practitioners to explore algorithms, computational models, architectures, languages, and interfaces for systems that need large-scale parallelization and for systems designed to support efficient parallelization and fault tolerance. These include specialized programming and data-management systems based on MapReduce and its extensions, graph processing systems, and data-intensive workflow and dataflow systems. The program featured two very well-attended invited talks by Ion Stoica from AMPLab, University of California, Berkeley and Carlos Guestrin from the University of Washington.

References

  1. Foto N. Afrati, Jacek Sroka, and Jan Hidders, editors. Proceedings of the 3rd ACM SIGMOD Workshop on Algorithms and Systems for MapReduce and Beyond, BeyondMR@SIGMOD 2016, San Francisco, CA, USA, July 1, 2016. ACM, 2016. http://doi.acm.org/10.1145/2926534.
  2. Michael Armbrust, Reynold S. Xin, Cheng Lian, Yin Huai, Davies Liu, Joseph K. Bradley, Xiangrui Meng, Tomer Kaftan, Michael J. Franklin, Ali Ghodsi, and Matei Zaharia. Spark SQL: relational data processing in Spark. In Timos K. Sellis, Susan B. Davidson, and Zachary G. Ives, editors, Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, Melbourne, Victoria, Australia, May 31 - June 4, 2015, pages 1383--1394. ACM, 2015.
  3. Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16, pages 785--794, New York, NY, USA, 2016. ACM.
  4. Harunobu Daikoku, Hideyuki Kawashima, and Osamu Tatebe. On exploring efficient shuffle design for in-memory MapReduce. In Afrati et al. [1], page 6. http://doi.acm.org/10.1145/2926534.2926538.
  5. Joseph E. Gonzalez, Reynold S. Xin, Ankur Dave, Daniel Crankshaw, Michael J. Franklin, and Ion Stoica. GraphX: Graph processing in a distributed dataflow framework. In Jason Flinn and Hank Levy, editors, 11th USENIX Symposium on Operating Systems Design and Implementation, OSDI '14, Broomfield, CO, USA, October 6-8, 2014, pages 599--613. USENIX Association, 2014.
  6. Gösta Grahne, Shahab Harrafi, Iraj Hedayati, and Ali Moallemi. DFA minimization in map-reduce. In Afrati et al. [1], page 4. http://doi.acm.org/10.1145/2926534.2926537.
  7. Michael Greenwald and Sanjeev Khanna. Space-efficient online computation of quantile summaries. In Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data, SIGMOD '01, pages 58--66, New York, NY, USA, 2001. ACM.
  8. Paraschos Koutris and Nivetha Singara Vadivelu. Deterministic load balancing for parallel joins. In Afrati et al. [1], page 10. http://doi.acm.org/10.1145/2926534.2926536.
  9. Andreas Kunft, Alexander Alexandrov, Asterios Katsifodimos, and Volker Markl. Bridging the gap: towards optimization across linear and relational algebra. In Afrati et al. [1], page 1. http://doi.acm.org/10.1145/2926534.2926540.
  10. Andrea Lattuada, Frank McSherry, and Zaheer Chothia. Faucet: a user-level, modular technique for flow control in dataflow engines. In Afrati et al. [1], page 2. http://doi.acm.org/10.1145/2926534.2926544.
  11. Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, and Joseph M. Hellerstein. GraphLab: A new framework for parallel machine learning. In Peter Grünwald and Peter Spirtes, editors, UAI 2010, Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence, Catalina Island, CA, USA, July 8-11, 2010, pages 340--349. AUAI Press, 2010.
  12. Xiangrui Meng, Joseph K. Bradley, Burak Yavuz, Evan R. Sparks, Shivaram Venkataraman, Davies Liu, Jeremy Freeman, D. B. Tsai, Manish Amde, Sean Owen, Doris Xin, Reynold Xin, Michael J. Franklin, Reza Zadeh, Matei Zaharia, and Ameet Talwalkar. MLlib: Machine learning in Apache Spark. CoRR, abs/1505.06807, 2015.
  13. Prakash Ramanan and Ashita Nagar. Tight bounds on one- and two-pass MapReduce algorithms for matrix multiplication. In Afrati et al. [1], page 9. http://doi.acm.org/10.1145/2926534.2926542.
  14. Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. "Why should I trust you?": Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16, pages 1135--1144, New York, NY, USA, 2016. ACM.
  15. Anish Das Sarma, Foto N. Afrati, Semih Salihoglu, and Jeffrey D. Ullman. Upper and lower bounds on the cost of a map-reduce computation. In Proceedings of the 39th International Conference on Very Large Data Bases, PVLDB'13, pages 277--288. VLDB Endowment, 2013.
  16. Johannes Schildgen, Thomas Lottermann, and Stefan Deßloch. Cross-system NoSQL data transformations with NotaQL. In Afrati et al. [1], page 5. http://doi.acm.org/10.1145/2926534.2926535.
  17. Jeffrey D. Ullman and Jonathan R. Ullman. Some pairs problems. In Afrati et al. [1], page 8. http://doi.acm.org/10.1145/2926534.2926543.
  18. Shivaram Venkataraman, Zongheng Yang, Davies Liu, Eric Liang, Hossein Falaki, Xiangrui Meng, Reynold Xin, Ali Ghodsi, Michael J. Franklin, Ion Stoica, and Matei Zaharia. SparkR: Scaling R programs with Spark. In Fatma Özcan, Georgia Koutrika, and Sam Madden, editors, Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016, San Francisco, CA, USA, June 26 - July 01, 2016, pages 1099--1104. ACM, 2016.
  19. Jingjing Wang and Magdalena Balazinska. Toward elastic memory management for cloud data analytics. In Afrati et al. [1], page 7. http://doi.acm.org/10.1145/2926534.2926541.
  20. Matei Zaharia, Tathagata Das, Haoyuan Li, Timothy Hunter, Scott Shenker, and Ion Stoica. Discretized streams: fault-tolerant streaming computation at scale. In Michael Kaminsky and Mike Dahlin, editors, ACM SIGOPS 24th Symposium on Operating Systems Principles, SOSP '13, Farmington, PA, USA, November 3-6, 2013, pages 423--438. ACM, 2013.
  21. Matei Zaharia, Reynold S. Xin, Patrick Wendell, Tathagata Das, Michael Armbrust, Ankur Dave, Xiangrui Meng, Josh Rosen, Shivaram Venkataraman, Michael J. Franklin, Ali Ghodsi, Joseph Gonzalez, Scott Shenker, and Ion Stoica. Apache Spark: a unified engine for big data processing. Commun. ACM, 59(11):56--65, 2016.
  22. Bingjing Zhang, Bo Peng, and Judy Qiu. Model-centric computation abstractions in machine learning applications. In Afrati et al. [1], page 3. http://doi.acm.org/10.1145/2926534.2926539.
  23. Qi Zhang and Wei Wang. A fast algorithm for approximate quantiles in high speed data streams. In Proceedings of the 19th International Conference on Scientific and Statistical Database Management, SSDBM '07, pages 29--29, Washington, DC, USA, 2007. IEEE Computer Society.
