Abstract
This work on efficient PageRank computation is motivated by two observations: the bulk synchronous parallel (BSP) computing model incurs costly synchronization barriers, and asynchronous communication can avoid the resulting long waiting times. By performing updates inside RDDs, an iteration can proceed to the next round without a synchronization barrier across all partitions. Experimental results indicate that our method significantly improves execution speed compared to GraphX.
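The contrast between barrier-style and barrier-free updating can be illustrated with a small single-process sketch. It is not the method of this paper and does not use Spark or RDDs: the graph, the damping factor 0.85, the tolerance, and all function names are assumptions chosen for illustration. The synchronous variant reads only the previous round's ranks, which is what a barrier enforces, while the in-place variant lets each vertex use whatever values are already available.

```python
# Illustrative only: a toy single-process contrast between barrier-style
# (synchronous) and in-place (asynchronous-style) PageRank sweeps.
# The graph, damping factor, and tolerance are assumptions for this sketch.

DAMPING = 0.85
TOL = 1e-10

# Hypothetical 4-vertex graph: vertex -> list of out-neighbours.
# Every vertex has at least one out-link, so no dangling-node handling is needed.
GRAPH = {0: [1, 2], 1: [2], 2: [0], 3: [0, 2]}


def in_links(graph):
    """Invert the adjacency list so each vertex knows who points at it."""
    incoming = {v: [] for v in graph}
    for src, dsts in graph.items():
        for dst in dsts:
            incoming[dst].append(src)
    return incoming


def new_rank(v, ranks, graph, incoming):
    """Standard PageRank update for one vertex from the given rank table."""
    n = len(graph)
    contrib = sum(ranks[u] / len(graph[u]) for u in incoming[v])
    return (1.0 - DAMPING) / n + DAMPING * contrib


def synchronous_pagerank(graph, max_sweeps=1000):
    """Each sweep reads only the previous sweep's ranks (barrier semantics)."""
    incoming = in_links(graph)
    ranks = {v: 1.0 / len(graph) for v in graph}
    for sweep in range(1, max_sweeps + 1):
        updated = {v: new_rank(v, ranks, graph, incoming) for v in graph}
        delta = sum(abs(updated[v] - ranks[v]) for v in graph)
        ranks = updated  # all vertices switch to the new values together
        if delta < TOL:
            return ranks, sweep
    return ranks, max_sweeps


def in_place_pagerank(graph, max_sweeps=1000):
    """Each vertex immediately sees values already updated in this sweep."""
    incoming = in_links(graph)
    ranks = {v: 1.0 / len(graph) for v in graph}
    for sweep in range(1, max_sweeps + 1):
        delta = 0.0
        for v in graph:
            value = new_rank(v, ranks, graph, incoming)
            delta += abs(value - ranks[v])
            ranks[v] = value  # written back at once, no end-of-round barrier
        if delta < TOL:
            return ranks, sweep
    return ranks, max_sweeps


if __name__ == "__main__":
    sync_ranks, sync_sweeps = synchronous_pagerank(GRAPH)
    async_ranks, async_sweeps = in_place_pagerank(GRAPH)
    print("synchronous:", sync_sweeps, "sweeps", sync_ranks)
    print("in-place:   ", async_sweeps, "sweeps", async_ranks)
```

Both variants approach the same fixed point on this graph; they differ only in which values an update is allowed to read. In a distributed setting the in-place idea would apply within and across partitions, which this sketch does not model.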
Acknowledgments
This work was supported by the Scientific Research Project of the Education Department of Hubei Province under Grant No. Q20141410.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Li, C., Chen, J., Yang, Z., Chen, W. (2018). Asynchronous Page-Rank Computation in Spark. In: Barolli, L., Terzo, O. (eds) Complex, Intelligent, and Software Intensive Systems. CISIS 2017. Advances in Intelligent Systems and Computing, vol 611. Springer, Cham. https://doi.org/10.1007/978-3-319-61566-0_52
DOI: https://doi.org/10.1007/978-3-319-61566-0_52
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61565-3
Online ISBN: 978-3-319-61566-0
eBook Packages: Engineering, Engineering (R0)