ABSTRACT
CPU cache performance is a key factor in the efficiency of database systems. Cache miss latency is reported to account for half of the execution time in database systems. To improve CPU cache performance, prior studies have proposed search structures such as cache-oblivious and cache-conscious trees. In this paper, we focus on speeding up graph computing in general by reducing the CPU cache miss ratio across different graph algorithms. Approaches designed for trees do not apply to graphs, which are complex in nature.
We explore a general approach to speeding up CPU computing that further improves the efficiency of graph algorithms without changing their implementations or the data structures they use. That is, we aim to design a general solution that is tied neither to a specific graph algorithm nor to a specific data structure.
The approach studied in this work is graph ordering: finding the optimal permutation of the nodes in a given graph so that nodes that are frequently accessed together are kept close to each other, which minimizes the CPU cache miss ratio.
We prove that the graph ordering problem is NP-hard and give a basic algorithm with a bounded approximation. We then propose a new algorithm that reduces the time complexity of the basic algorithm and improves its efficiency through optimization techniques based on a new data structure.
We conducted extensive experiments to evaluate our approach against 9 other possible graph orderings (such as the one obtained by METIS) using 8 large real graphs and 9 representative graph algorithms. The results confirm that our approach achieves high performance by reducing CPU cache miss ratios.
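The idea of graph ordering can be illustrated with a minimal sketch. The helper names `relabel` and `mean_edge_gap`, the toy graph, and the use of the mean id gap across edges as a stand-in for cache locality are all assumptions made for illustration; they are not the objective function or the algorithm proposed in the paper.

```python
from typing import Dict, List

Graph = Dict[int, List[int]]  # node id -> list of out-neighbours


def relabel(graph: Graph, order: List[int]) -> Graph:
    """Return a copy of `graph` whose nodes are renumbered by `order`.

    `order[i]` is the original id of the node placed at position i.
    """
    new_id = {old: new for new, old in enumerate(order)}
    return {
        new_id[u]: sorted(new_id[v] for v in nbrs)
        for u, nbrs in graph.items()
    }


def mean_edge_gap(graph: Graph) -> float:
    """Average |u - v| over all edges: a crude locality proxy (assumption)."""
    gaps = [abs(u - v) for u, nbrs in graph.items() for v in nbrs]
    return sum(gaps) / len(gaps) if gaps else 0.0


# Two triangles {0, 2, 4} and {1, 3, 5} whose ids are interleaved.
graph: Graph = {
    0: [2, 4], 2: [0, 4], 4: [0, 2],
    1: [3, 5], 3: [1, 5], 5: [1, 3],
}

# Hypothetical ordering that places one triangle after the other.
order = [0, 2, 4, 1, 3, 5]
reordered = relabel(graph, order)

print(mean_edge_gap(graph), mean_edge_gap(reordered))
```

Relabeling changes only the node ids, so any algorithm that runs over the adjacency lists works unchanged on the reordered graph, in line with the stated goal of leaving algorithms and data structures untouched. In this toy case the mean gap drops after reordering, because neighbouring nodes now have adjacent ids.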