Random Walks on Huge Graphs at Cache Efficiency

Authors:
Ke Yang
Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, China; Qatar Computing Research Institute, Hamad Bin Khalifa University; and Beijing HaiZhi XingTu Technology Co., Ltd.

Xiaosong Ma
Qatar Computing Research Institute, Hamad Bin Khalifa University

Saravanan Thirumuruganathan
Qatar Computing Research Institute, Hamad Bin Khalifa University

Kang Chen
Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, China; and Beijing HaiZhi XingTu Technology Co., Ltd.

Yongwei Wu
Department of Computer Science and Technology, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, China; and Beijing HaiZhi XingTu Technology Co., Ltd.

SOSP '21: Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles, October 2021, Pages 311–326. https://doi.org/10.1145/3477132.3483575

Published: 26 October 2021

ABSTRACT

Data-intensive applications dominated by random accesses to large working sets fail to utilize the computing power of modern processors. Graph random walk, an indispensable workhorse for many important graph processing and learning applications, is one prominent case of such applications. Existing graph random walk systems are currently unable to match the GPU-side node embedding training speed.

This work reveals that existing approaches fail to effectively utilize the modern CPU memory hierarchy, due to the widely held assumption that the inherent randomness in random walks and the skewed nature of graphs render most memory accesses random. We demonstrate that there is actually plenty of spatial and temporal locality to harvest, by careful partitioning, rearranging, and batching of operations. The resulting system, FlashMob, improves both cache and memory bandwidth utilization by making memory accesses more sequential and regular. We also found that a classical combinatorial optimization problem (and its exact pseudo-polynomial solution) can be applied to complex decision making, for accurate yet efficient data/task partitioning. Our comprehensive experiments over diverse graphs show that our system achieves an order of magnitude performance improvement over the fastest existing system. It processes a 58GB real graph at higher per-step speed than the existing system on a 600KB toy graph fitting in the L2 cache.
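
As a rough illustration of the partition-and-batch idea described above, the sketch below groups walkers by the partition of their current vertex before each step, so that a step touches the adjacency data of one partition at a time instead of jumping across the whole graph. It is not FlashMob's implementation: the CSR layout, the fixed-size contiguous partitions, the uniform (first-order) neighbor choice, and all names (`build_csr`, `step_batched`, `part_size`) are assumptions made for this example.

```python
import random


def build_csr(num_vertices, edges):
    """Build a CSR adjacency (offsets, targets) from a directed edge list."""
    degree = [0] * num_vertices
    for src, _ in edges:
        degree[src] += 1
    offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        offsets[v + 1] = offsets[v] + degree[v]
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()  # next free slot in each vertex's edge range
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def step_batched(offsets, targets, walkers, part_size, rng):
    """Advance every walker by one step, one vertex partition at a time.

    walkers[i] holds the current vertex of walker i and is updated in place.
    Partitions are contiguous vertex-id ranges of length part_size
    (an assumption of this sketch).
    """
    num_vertices = len(offsets) - 1
    num_parts = (num_vertices + part_size - 1) // part_size
    # Bucket walker ids by the partition of their current vertex.
    buckets = [[] for _ in range(num_parts)]
    for wid, v in enumerate(walkers):
        buckets[v // part_size].append(wid)
    # Process one bucket at a time so that only one partition's slice of
    # the CSR arrays is being read while that bucket is handled.
    for bucket in buckets:
        for wid in bucket:
            v = walkers[wid]
            begin, end = offsets[v], offsets[v + 1]
            if end > begin:  # walkers on dead-end vertices stay in place
                walkers[wid] = targets[rng.randrange(begin, end)]


# Usage: directed cycle 0 -> 1 -> 2 -> 3 -> 0, eight walkers, three steps.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
offsets, targets = build_csr(4, edges)
walkers = [0, 1, 2, 3, 0, 1, 2, 3]
rng = random.Random(0)
for _ in range(3):
    step_batched(offsets, targets, walkers, part_size=2, rng=rng)
print(walkers)
```

On the cycle every vertex has exactly one out-neighbor, so the walk is deterministic and each walker ends three vertices ahead of where it started. In a real system the bucketing would more likely be a counting sort over flat arrays, with partition sizes chosen to fit a target cache level; those details are left out here.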

Published in

SOSP '21: Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles
October 2021
899 pages
ISBN: 9781450387095
DOI: 10.1145/3477132

Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Badges
- Artifacts Evaluated & Functional / v1.1
- Artifacts Available / v1.1
Author Tags
cache
graph computing
memory
random walk
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 131 of 716 submissions, 18%