Abstract
A link analysis on a distribute system is a viable choice to evaluate relationships between web-pages in a large web-graph. Each computational processor in the system contains a partial local web-graph and it locally performs web ranking. Since a distributed web ranking is generally incur penalties on execution times and accuracy from data synchronization, a web-graph can preliminary partitioned with a desired structure before a link analysis algorithm is started to improve execution time and accuracy. However, in the real-word situation, the numbers of web-pages in the web-graph can be continuously increased. Therefore, a link analysis algorithm has to re-partition a web-graph and re-perform web-pages ranking every time when the new web-pages are collected. In this paper, an efficient distributed web-pages ranking algorithm with min-cut density-balanced partitioning is proposed to improve the execution time of this scenario. The algorithm will re-partition the web-graph and re-perform the web-pages ranking only when necessary. The experimental results show that the proposed algorithm outperform in terms of the ranking’s execution times and the ranking’s accuracy.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
References
Fagin, R., Kumar, R., Sivakumar, D.: Comparing top k lists. In: Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, pp. 28–36 (2003)
Garey, M.R., Johnson, D.S.: Computers and Intractability; A Guide to the Theory of NP-Completeness. W.H. Freeman & Co., New York (1990)
Leskovec, J., Krevl, A.: SNAP Datasets: Stanford large network dataset collection (2014). http://snap.stanford.edu/data
Montresor, A., Jelasity, M.: PeerSim: a scalable P2P simulator. In: Proceedings of the 9th International Conference on Peer-to-Peer (P2P 2009), pp. 99–100. Seattle (2009)
Parreira, J.X., Donato, D., Castillo, C., Weikum, G.: Computing trusted authority scores in peer-to-peer web search networks. In: Proceedings of the 3rd International Workshop on Adversarial Information Retrieval on the Web, AIRWeb 2007, pp. 73–80. ACM, New York (2007). https://doi.org/10.1145/1244408.1244422
Parreira, J.X., Weikum, G.: JXP global authority scores in a P2P network. In: Proceedings of the Eight International Work-shop on the Web and Databases (WebDB 2005), pp. 31–36. Baltimore (2005)
Sangamuang, S., Boonma, P., Natwichai, J.: A p2p-based incremental web ranking algorithm. In: Proceedings of the 2011 International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, pp. 123–127 (2011)
Sangamuang, S., Boonma, P., Natwichai, J.: An efficient algorithm for density-balanced partitioning in distributed pagerank. In: Proceedings of the 2014 9th International Conference on Digital Information Management, ICDIM 2014, pp. 118–123 (2014)
Sangamuang, S., Boonma, P., Natwichai, J.: An Algorithm for Min-Cut Density-Balanced Partitioning in P2P Web Ranking, pp. 257–266. Springer International Publishing, Cham (2015)
Sangamuang, S., Natwichai, J., Boonma, P.: Incremental web ranking on p2p networks. In: Proceedings of the 2011 3rd International Conference on Computer Research and Development, vol. 4, pp. 519–523 (2011)
Sankaralingam, K., Sethumadhavan, S., Browne, J.C.: Distributed pagerank for p2p systems. In: Proceedings of the 12th IEEE International Symposium on High Performance Distributed Computing, p. 58. IEEE Computer Society (2003)
Shi, S., Yu, J., Yang, G., Wang, D.: Distributed page ranking in structured p2p networks. In: Proceedings of the 2003 International Conference on Parallel Processing (2003)
Steinbauer, M., Anderst-Kotsis, G.: Dynamograph: a distributed system for large-scale, temporal graph processing, its implementation and first observations. In: Proceedings of the 25th International Conference Companion on World Wide Web, pp. 861–866 (2016)
Stoica, I., et al.: Chord: a scalable peer-to-peer lookup protocol for internet applications. IEEE/ACM Trans. Netw. 11(1), 17–32 (2003)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Sangamuang, S., Boonma, P., Natwichai, J. (2019). iDBP: A Distributed Min-Cut Density-Balanced Algorithm for Incremental Web-Pages Ranking. In: Xhafa, F., Leu, FY., Ficco, M., Yang, CT. (eds) Advances on P2P, Parallel, Grid, Cloud and Internet Computing. 3PGCIC 2018. Lecture Notes on Data Engineering and Communications Technologies, vol 24. Springer, Cham. https://doi.org/10.1007/978-3-030-02607-3_1
Download citation
DOI: https://doi.org/10.1007/978-3-030-02607-3_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02606-6
Online ISBN: 978-3-030-02607-3
eBook Packages: EngineeringEngineering (R0)