Asynchronous Page-Rank Computation in Spark

  • Conference paper
Complex, Intelligent, and Software Intensive Systems (CISIS 2017)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 611)

Abstract

This work on high-efficiency PageRank computation is motivated by two observations: the bulk synchronous parallel (BSP) computing model incurs costly synchronization barriers, and asynchronous communication can avoid the long waits those barriers cause. By performing updates inside RDDs, each iteration can proceed to the next round without a synchronization barrier across all partitions. Experimental results indicate that our method significantly improves execution speed compared with GraphX.
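
The contrast between barrier-synchronized and asynchronous updates can be illustrated with a minimal single-process sketch. It is not the authors' Spark implementation: it uses plain Python dictionaries instead of RDDs, and the function names, the damping factor 0.85, the tolerance, and the assumption that every node has at least one out-link are all choices made for illustration. The synchronous version reads only the ranks of the previous round, while the asynchronous version overwrites ranks in place so that later updates in the same sweep already see the fresher values.

```python
def _in_links(out_links):
    """Invert an adjacency dict {node: [targets]} into {node: [sources]}."""
    incoming = {v: [] for v in out_links}
    for u, targets in out_links.items():
        for v in targets:
            incoming[v].append(u)
    return incoming


def pagerank_sync(out_links, damping=0.85, tol=1e-10, max_rounds=1000):
    """Barrier-style PageRank: each round reads only the previous round's ranks."""
    nodes = sorted(out_links)
    n = len(nodes)
    incoming = _in_links(out_links)
    rank = {v: 1.0 / n for v in nodes}
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        # All new values are computed from the old snapshot, then swapped in
        # together; this swap plays the role of the synchronization barrier.
        new_rank = {
            v: (1 - damping) / n
            + damping * sum(rank[u] / len(out_links[u]) for u in incoming[v])
            for v in nodes
        }
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank, rounds


def pagerank_async(out_links, damping=0.85, tol=1e-10, max_rounds=1000):
    """In-place PageRank: each update immediately uses the freshest ranks."""
    nodes = sorted(out_links)
    n = len(nodes)
    incoming = _in_links(out_links)
    rank = {v: 1.0 / n for v in nodes}
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        delta = 0.0
        for v in nodes:
            # No snapshot: ranks already updated in this sweep are read directly.
            updated = (1 - damping) / n + damping * sum(
                rank[u] / len(out_links[u]) for u in incoming[v]
            )
            delta += abs(updated - rank[v])
            rank[v] = updated
        if delta < tol:
            break
    return rank, rounds
```

Calling both functions on a small graph such as `{"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}` should yield the same ranks up to the tolerance; the number of sweeps each one needs depends on the graph and on the update order.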

Acknowledgments

This work was supported by the Scientific Research Project of the Education Department of Hubei Province under Grant No. Q20141410.

Author information

Corresponding author

Correspondence to Chao Li.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Li, C., Chen, J., Yang, Z., Chen, W. (2018). Asynchronous Page-Rank Computation in Spark. In: Barolli, L., Terzo, O. (eds) Complex, Intelligent, and Software Intensive Systems. CISIS 2017. Advances in Intelligent Systems and Computing, vol 611. Springer, Cham. https://doi.org/10.1007/978-3-319-61566-0_52

  • DOI: https://doi.org/10.1007/978-3-319-61566-0_52

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-61565-3

  • Online ISBN: 978-3-319-61566-0

  • eBook Packages: Engineering, Engineering (R0)
