DOI: 10.1145/3470496.3527409
research-article

TDGraph: a topology-driven accelerator for high-performance streaming graph processing

Published: 11 June 2022

ABSTRACT

Many solutions have recently been proposed to support the processing of streaming graphs. However, when each snapshot of a streaming graph is processed, the new states of the vertices affected by the graph updates are propagated irregularly along the graph topology. Despite years of research effort, existing approaches still suffer from serious redundant computation overhead and irregular memory access, which leave a many-core processor severely underutilized. To address these issues, this paper proposes TDGraph, a topology-driven programmable accelerator, which is the first accelerator to augment many-core processors for high-performance processing of streaming graphs. Specifically, we integrate an efficient topology-driven incremental execution approach into the accelerator design for more regular state propagation and better data locality. TDGraph takes the vertices affected by graph updates as roots, prefetches other vertices along the graph topology, and synchronizes their incremental computations on the fly. In this way, most state propagations originating from multiple vertices affected by different graph updates can be conducted together along the graph topology, which helps reduce redundant computation and data access cost. In addition, by efficiently coalescing accesses to vertex states, TDGraph further improves the utilization of the cache and memory bandwidth. We have evaluated TDGraph on a simulated 64-core processor. The results show that the state-of-the-art software system achieves a speedup of 7.1 to 21.4 times after integrating with TDGraph, while incurring only 0.73% area cost. Compared with four cutting-edge accelerators, i.e., HATS, Minnow, PHI, and DepGraph, TDGraph gains speedups of 4.6 to 12.7, 3.2 to 8.6, 3.8 to 9.7, and 2.3 to 6.1 times, respectively.
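The idea of propagating state from several affected vertices together can be illustrated in software. The following is a minimal sketch only, not TDGraph's hardware design: it assumes single-source shortest paths as the algorithm, handles edge insertions only, and uses a priority queue to order propagation. The names propagate, insert_edges, and toy are hypothetical and do not come from the paper.

```python
import heapq


def propagate(graph, dist, roots):
    """Relax outgoing edges starting from all roots at once.

    Returns the number of vertex-state writes performed.
    """
    heap = [(dist[v], v) for v in roots]
    heapq.heapify(heap)
    writes = 0
    while heap:
        d, u = heapq.heappop(heap)
        if d != dist[u]:
            continue  # stale entry: u was improved after this entry was pushed
        for v, w in graph.get(u, []):
            if d + w < dist[v]:
                dist[v] = d + w
                writes += 1
                heapq.heappush(heap, (dist[v], v))
    return writes


def insert_edges(graph, dist, edges):
    """Apply a batch of edge insertions (u, v, weight) and update dist in place.

    Vertices whose state improves directly become the roots of one shared
    propagation pass. Returns the total number of vertex-state writes.
    """
    roots = set()
    writes = 0
    for u, v, w in edges:
        graph.setdefault(u, []).append((v, w))
        if dist[u] + w < dist[v]:
            dist[v] = dist[u] + w
            writes += 1
            roots.add(v)
    return writes + propagate(graph, dist, roots)


def toy():
    """Chain 0 -> 1 -> 2 -> 3 -> 4 with converged distances from vertex 0."""
    graph = {0: [(1, 10)], 1: [(2, 1)], 2: [(3, 1)], 3: [(4, 1)]}
    dist = {0: 0, 1: 10, 2: 11, 3: 12, 4: 13}
    return graph, dist


updates = [(0, 2, 5), (0, 1, 2)]

g, d = toy()
batched = insert_edges(g, d, updates)  # both updates share one propagation

g2, d2 = toy()
one_by_one = sum(insert_edges(g2, d2, [e]) for e in updates)

print(batched, one_by_one)  # 5 writes batched vs 7 one at a time on this toy graph
```

On this toy graph both strategies reach the same distances, but handling the updates together avoids writing intermediate states for vertices 3 and 4 that the second update would overwrite anyway. Edge deletions and the coalescing of memory accesses are outside the scope of this sketch.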

          Published in
          ISCA '22: Proceedings of the 49th Annual International Symposium on Computer Architecture
          June 2022
          1097 pages
          ISBN:9781450386104
          DOI:10.1145/3470496

          Copyright © 2022 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          ISCA '22 Paper Acceptance Rate: 67 of 400 submissions, 17%. Overall Acceptance Rate: 543 of 3,203 submissions, 17%.
