Hazy: making it easier to build and maintain big-data analytics

Published: 01 March 2013

Abstract

Racing to unleash the full potential of big data with the latest statistical and machine-learning techniques.

Published In

Communications of the ACM, Volume 56, Issue 3
March 2013
93 pages
ISSN: 0001-0782
EISSN: 1557-7317
DOI: 10.1145/2428556
Publisher

Association for Computing Machinery

New York, NY, United States