DOI: 10.1145/3010089.3010118
research-article

A Parallel Data Mining Algorithm for PageRank Computation

Published: 10 November 2016

Abstract

We study the use of graphics processing units (GPUs) to accelerate the PageRank data mining algorithm and to reduce the memory footprint of the web graph. We first present a new web graph representation that uses a compressed format to reduce the memory allocated to the web graph. The web graph is then partitioned into small chunks to be processed on the GPU device. The basic steps of the algorithm are split into parallel operations, implemented in CUDA, so as to exploit the computing power of the GPUs as fully as possible. In the experiments, we tested the algorithm on GPUs with a set of real web data and compared the computation with a CPU-based one. The results show that the proposed GPU PageRank computation outperforms the CPU version by a factor of 100 while reducing the web graph memory storage by 93.928%.
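
The abstract does not spell out the compressed format or the kernel design, so the following is only a minimal CPU-side sketch of the general idea, under stated assumptions: the web graph is stored in compressed sparse row (CSR) form over incoming links, pages are processed in fixed-size chunks, and each page's update is independent of the others within an iteration (the part a GPU thread could take over). The names `build_csr`, `pagerank`, and `chunk_size`, the damping factor 0.85, and the handling of dangling pages are illustrative choices, not taken from the paper.

```python
def build_csr(num_pages, edges):
    """Store incoming links in CSR form (an assumed format, not the paper's).

    row_ptr[v]..row_ptr[v+1] delimits, inside col_idx, the pages linking to v.
    """
    out_degree = [0] * num_pages
    in_count = [0] * num_pages
    for src, dst in edges:
        out_degree[src] += 1
        in_count[dst] += 1

    row_ptr = [0] * (num_pages + 1)
    for v in range(num_pages):
        row_ptr[v + 1] = row_ptr[v] + in_count[v]

    col_idx = [0] * len(edges)
    next_slot = row_ptr[:-1]  # slicing copies, so row_ptr is not modified
    for src, dst in edges:
        col_idx[next_slot[dst]] = src
        next_slot[dst] += 1
    return row_ptr, col_idx, out_degree


def pagerank(num_pages, edges, damping=0.85, chunk_size=2,
             tol=1e-10, max_iter=200):
    """Power-iteration PageRank over a CSR graph, one chunk of pages at a time."""
    row_ptr, col_idx, out_degree = build_csr(num_pages, edges)
    rank = [1.0 / num_pages] * num_pages

    for _ in range(max_iter):
        # Rank held by pages without out-links is spread uniformly (assumption).
        dangling = sum(rank[u] for u in range(num_pages) if out_degree[u] == 0)
        base = (1.0 - damping) / num_pages + damping * dangling / num_pages

        new_rank = [0.0] * num_pages
        # Each chunk stands in for one partition shipped to the GPU device.
        for start in range(0, num_pages, chunk_size):
            stop = min(start + chunk_size, num_pages)
            # Iterations of this loop are independent: one GPU thread per page.
            for v in range(start, stop):
                acc = 0.0
                for k in range(row_ptr[v], row_ptr[v + 1]):
                    u = col_idx[k]
                    acc += rank[u] / out_degree[u]
                new_rank[v] = base + damping * acc

        delta = sum(abs(a - b) for a, b in zip(new_rank, rank))
        rank = new_rank
        if delta < tol:
            break
    return rank
```

For example, `pagerank(3, [(0, 1), (1, 2), (2, 0)])` ranks the three pages of a directed cycle equally. CSR stores one integer per link plus one offset per page instead of a dense adjacency matrix, which is the kind of saving a compressed representation aims for; the 93.928% figure above refers to the paper's own format, not to this sketch.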

Cited By

  • (2023) Optimization of page rank algorithm using parallelization method. 2nd International Conference on Recent Advances in Computational Techniques. DOI: 10.1063/5.0153296 (020011). Online publication date: 2023.

Information & Contributors

Information

Published In

ACM Other conferences
BDAW '16: Proceedings of the International Conference on Big Data and Advanced Wireless Technologies
November 2016
398 pages
ISBN: 9781450347792
DOI: 10.1145/3010089
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

In-Cooperation

  • ANR: Agence Nationale pour la Recherche
  • LABSTICC: Labsticc

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Big Data
  2. CUDA
  3. Data Mining
  4. PageRank
  5. Parallel Computation

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

BDAW '16
