Asynchronous Page-Rank Computation in Spark

Li, Chao; Chen, JianXia; Yang, Zhi; Chen, WuYan

doi:10.1007/978-3-319-61566-0_52

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 611)

Included in the following conference series:

Conference on Complex, Intelligent, and Software Intensive Systems

Abstract

This work on efficient PageRank computation is motivated by two observations: the bulk synchronous parallel (BSP) computing model incurs costly synchronization barriers, and asynchronous communication can avoid the long waits that those barriers cause. By performing updates inside RDDs, an iteration can proceed to the next round without a synchronization barrier across all partitions. Experimental results indicate that our method significantly improves execution speed compared with GraphX.
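The contrast between barrier-based and barrier-free iteration can be illustrated with a minimal single-process sketch. It is not the chapter's Spark implementation and does not use RDDs. It only assumes the standard damped PageRank update, a hypothetical four-vertex graph `GRAPH` with no dangling vertices, and hypothetical function names (`pagerank_sync`, `pagerank_async`). The synchronous version reads only the previous round's ranks, which mimics a global barrier; the asynchronous version writes each new rank in place, so later updates in the same sweep already see it.

```python
# Illustrative single-process sketch; NOT the paper's Spark implementation.

# Hypothetical toy graph: vertex -> list of out-neighbours (no dangling vertices).
GRAPH = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}


def _in_links(out_links):
    """Invert the adjacency lists: vertex -> list of vertices linking to it."""
    incoming = {v: [] for v in out_links}
    for u, targets in out_links.items():
        for v in targets:
            incoming[v].append(u)
    return incoming


def _new_rank(v, ranks, out_links, incoming, damping):
    """One damped PageRank update for v from whatever ranks are currently visible."""
    n = len(out_links)
    received = sum(ranks[u] / len(out_links[u]) for u in incoming[v])
    return (1.0 - damping) / n + damping * received


def pagerank_sync(out_links, damping=0.85, tol=1e-10, max_rounds=1000):
    """Barrier-style iteration: each round reads only the previous round's ranks."""
    incoming = _in_links(out_links)
    ranks = {v: 1.0 / len(out_links) for v in out_links}
    for rounds in range(1, max_rounds + 1):
        updated = {
            v: _new_rank(v, ranks, out_links, incoming, damping) for v in out_links
        }
        delta = sum(abs(updated[v] - ranks[v]) for v in out_links)
        ranks = updated  # all vertices switch to the new values together
        if delta < tol:
            break
    return ranks, rounds


def pagerank_async(out_links, damping=0.85, tol=1e-10, max_rounds=1000):
    """Barrier-free iteration: each new rank is written in place immediately."""
    incoming = _in_links(out_links)
    ranks = {v: 1.0 / len(out_links) for v in out_links}
    for rounds in range(1, max_rounds + 1):
        delta = 0.0
        for v in out_links:
            new = _new_rank(v, ranks, out_links, incoming, damping)
            delta += abs(new - ranks[v])
            ranks[v] = new  # visible to the remaining updates in this sweep
        if delta < tol:
            break
    return ranks, rounds


if __name__ == "__main__":
    print(pagerank_sync(GRAPH))
    print(pagerank_async(GRAPH))
```

On this toy graph both loops should settle on the same ranks, since they iterate toward the same fixed point. In-place schemes of this kind often need fewer sweeps than the barrier version, but the sketch says nothing about the speedups the chapter reports against GraphX.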

Acknowledgments

This work is supported by the Scientific Research Project of the Education Department of Hubei Province under Grant No. Q20141410.

Author information

Authors and Affiliations

School of Computer, Hubei University of Technology, Wuhan, 430068, China
Chao Li, JianXia Chen, Zhi Yang & WuYan Chen

Corresponding author

Correspondence to Chao Li.

Editor information

Editors and Affiliations

Department of Information and Communication Engineering, Faculty of Information Engineering, Fukuoka Institute of Technology, Fukuoka, Japan
Leonard Barolli
Politecnico di Torino, Istituto Superiore Mario Boella, Turin, Italy
Olivier Terzo

About this paper

Cite this paper

Li, C., Chen, J., Yang, Z., Chen, W. (2018). Asynchronous Page-Rank Computation in Spark. In: Barolli, L., Terzo, O. (eds) Complex, Intelligent, and Software Intensive Systems. CISIS 2017. Advances in Intelligent Systems and Computing, vol 611. Springer, Cham. https://doi.org/10.1007/978-3-319-61566-0_52

DOI: https://doi.org/10.1007/978-3-319-61566-0_52
Published: 05 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61565-3
Online ISBN: 978-3-319-61566-0
eBook Packages: Engineering, Engineering (R0)
