Abstract
The max-flow min-cut problem is one of the most widely studied problems in combinatorial algorithms and optimization. In this paper, we solve the max-flow min-cut problem on large random graphs with a log-normal distribution of outdegrees using a distributed Edmonds-Karp algorithm. The algorithm is implemented on a cluster using Spark. We compare the runtime of a single-machine implementation with that of the cluster implementation and analyze the impact of communication cost on runtime. In our experiments, the values observed in practice across various graphs are much lower than the theoretical estimates, primarily because of the small diameter of the graphs. Additionally, we extend this model theoretically to a large urban road network to evaluate the minimum number of sensors required for surveillance of the entire network. To validate the feasibility of this theoretical extension, we tested the model on a large log-normal graph with \(\sim \)1.1 million edges and obtained a max-flow value of 54, which implies that the minimum cut of the graph consists of 54 edges. This is a reasonably small set of edges on which to place sensors compared to the total number of edges. We believe that our approach can enhance the safety of road networks throughout the world.
V. Ramesh and S. Nagarajan contributed equally.
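To make the link between the max-flow value and the sensor count concrete, the following is a minimal single-machine sketch of Edmonds-Karp that also extracts a minimum cut. It is not the distributed Spark implementation described in the abstract. The function name `edmonds_karp`, the edge-list input format, and the toy graph are assumptions made for illustration only.

```python
from collections import defaultdict, deque


def edmonds_karp(edges, source, sink):
    """Return (max_flow_value, min_cut_edges) for a directed graph.

    `edges` is a list of (u, v, capacity) tuples. This is an illustrative,
    sequential sketch; the names and input format are hypothetical.
    """
    if source == sink:
        raise ValueError("source and sink must differ")

    # Residual capacities; every edge gets a reverse entry starting at 0.
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, cap in edges:
        residual[u][v] += cap
        residual[v][u] += 0

    def bfs_parents():
        # Breadth-first search over edges with remaining capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = bfs_parents()
        if sink not in parent:
            break  # no augmenting path is left

        # Find the bottleneck capacity along the shortest augmenting path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u

        # Push the bottleneck amount along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

    # Vertices still reachable from the source form one side of a minimum cut;
    # original edges leaving that side are the cut edges.
    reachable = set(parent)
    cut = sorted({(u, v) for u, v, cap in edges
                  if cap > 0 and u in reachable and v not in reachable})
    return flow, cut


# Toy "road network" with unit capacities: every route from s to t uses c -> t.
toy_edges = [("s", "a", 1), ("s", "b", 1), ("a", "c", 1),
             ("b", "c", 1), ("c", "t", 1)]
flow, cut = edmonds_karp(toy_edges, "s", "t")
print(flow, cut)
```

On this toy graph the max-flow value is 1 and the cut contains the single edge `("c", "t")`, so one sensor on that edge would observe every route from `s` to `t`. With unit capacities, the max-flow value equals the number of edges in a minimum cut, which is the sense in which a max-flow value of 54 corresponds to 54 candidate sensor locations.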
Copyright information
© 2017 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Ramesh, V., Nagarajan, S., Mukherjee, S. (2017). Max-flow Min-cut Algorithm in Spark with Application to Road Networks. In: Jung, J., Kim, P. (eds) Big Data Technologies and Applications. BDTA 2016. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 194. Springer, Cham. https://doi.org/10.1007/978-3-319-58967-1_2
DOI: https://doi.org/10.1007/978-3-319-58967-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58966-4
Online ISBN: 978-3-319-58967-1
eBook Packages: Computer Science, Computer Science (R0)