DOI: 10.1145/3243176.3243190
research-article

Data motifs: a lens towards fully understanding big data and AI workloads

Published: 01 November 2018

Abstract

The complexity and diversity of big data and AI workloads make them difficult to understand. This paper proposes a new approach to modelling and characterizing big data and AI workloads. We consider each big data and AI workload as a pipeline of one or more classes of units of computation performed on different initial or intermediate data inputs. Each class of unit of computation captures the common requirements while being reasonably divorced from individual implementations, and hence we call it a data motif. For the first time, among a wide variety of big data and AI workloads, we identify eight data motifs that take up most of the run time of those workloads: Matrix, Sampling, Logic, Transform, Set, Graph, Sort and Statistic. We implement the eight data motifs on different software stacks as the micro benchmarks of an open-source big data and AI benchmark suite, BigDataBench 4.0 (publicly available from http://prof.ict.ac.cn/BigDataBench), and perform a comprehensive characterization of those data motifs from the perspective of data sizes, types, sources, and patterns as a lens towards fully understanding big data and AI workloads. We believe the eight data motifs are promising abstractions and tools not only for big data and AI benchmarking, but also for domain-specific hardware and software co-design.
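
The pipeline view described in the abstract can be illustrated with a minimal Python sketch. Only the eight motif names come from the abstract; the `Motif` enum, the `run_pipeline` helper, and the toy workload are hypothetical names chosen for illustration and are not taken from the paper or from BigDataBench.

```python
from enum import Enum
from typing import Any, Callable, List, Tuple


class Motif(Enum):
    # The eight motif classes named in the abstract.
    MATRIX = "Matrix"
    SAMPLING = "Sampling"
    LOGIC = "Logic"
    TRANSFORM = "Transform"
    SET = "Set"
    GRAPH = "Graph"
    SORT = "Sort"
    STATISTIC = "Statistic"


# A stage pairs a motif label with one concrete implementation of it.
# The implementation is interchangeable; the motif label is the abstraction.
Stage = Tuple[Motif, Callable[[Any], Any]]


def run_pipeline(stages: List[Stage], data: Any) -> Tuple[Any, List[Motif]]:
    """Feed each stage's output into the next and record the motifs used."""
    trace: List[Motif] = []
    for motif, unit in stages:
        data = unit(data)  # intermediate data becomes the next stage's input
        trace.append(motif)
    return data, trace


# Hypothetical toy workload (an assumption, not a workload from the paper):
# keep every second value, de-duplicate, sort, then average.
toy_workload: List[Stage] = [
    (Motif.SAMPLING, lambda xs: xs[::2]),
    (Motif.SET, lambda xs: list(set(xs))),
    (Motif.SORT, sorted),
    (Motif.STATISTIC, lambda xs: sum(xs) / len(xs)),
]

result, trace = run_pipeline(toy_workload, [5, 9, 3, 7, 5, 1, 8, 2])
print(result, [m.value for m in trace])
```

Separating the motif label from the callable mirrors the abstract's point that a motif captures common requirements while staying divorced from any single implementation: swapping `sorted` for another sort routine would leave the recorded motif trace unchanged.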


Published In

PACT '18: Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques
November 2018
494 pages
ISBN:9781450359863
DOI:10.1145/3243176
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

In-Cooperation

  • IFIP WG 10.3
  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Publication Notes

Badge change: Article originally badged under Version 1.0 guidelines https://www.acm.org/publications/policies/artifact-review-badging

Author Tags

  1. AI
  2. big data
  3. data motif
  4. workload characterization

Qualifiers

  • Research-article

Conference

PACT '18

Acceptance Rates

Overall Acceptance Rate 121 of 471 submissions, 26%

Cited By

  • (2024) A Linear Combination-Based Method to Construct Proxy Benchmarks for Big Data Workloads. Benchmarking, Measuring, and Optimizing. 10.1007/978-981-97-0316-6_8 (120-136). Online publication date: 14-Feb-2024
  • (2023) A Structured Approach Towards Big Data Identification. IEEE Transactions on Big Data 9:1. 10.1109/TBDATA.2021.3139069 (147-159). Online publication date: 1-Feb-2023
  • (2023) Profiling gem5 Simulator. 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). 10.1109/ISPASS57527.2023.00019 (103-113). Online publication date: Apr-2023
  • (2023) Evaluating the Carbon Impact of Large Language Models at the Inference Stage. 2023 IEEE International Performance, Computing, and Communications Conference (IPCCC). 10.1109/IPCCC59175.2023.10253886 (150-157). Online publication date: 17-Nov-2023
  • (2023) Radiology, AI and Big Data: Challenges and Opportunities for Medical Imaging. Trends of Artificial Intelligence and Big Data for E-Health. 10.1007/978-3-031-11199-0_3 (33-55). Online publication date: 2-Jan-2023
  • (2022) A Labeled Architecture for Low-Entropy Clouds: Theory, Practice, and Lessons. Intelligent Computing 2022. 10.34133/2022/9795476. Online publication date: Jan-2022
  • (2022) Performance Analysis of Big Data Motifs on Large Core Machines. 2022 IEEE International Conference on Cloud Computing in Emerging Markets (CCEM). 10.1109/CCEM57073.2022.00014 (37-42). Online publication date: 12-Dec-2022
  • (2022) A BenchCouncil view on benchmarking emerging and future computing. BenchCouncil Transactions on Benchmarks, Standards and Evaluations 2:2. 10.1016/j.tbench.2022.100064 (100064). Online publication date: Apr-2022
  • (2022) Open-source computer systems initiative: The motivation, essence, challenges, and methodology. BenchCouncil Transactions on Benchmarks, Standards and Evaluations 2:1. 10.1016/j.tbench.2022.100038 (100038). Online publication date: Mar-2022
  • (2021) AIBench Scenario: Scenario-distilling AI Benchmarking. Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques. 10.1109/PACT52795.2021.00018 (142-158). Online publication date: 26-Sep-2021
