SOTAVerified

Improved Communication Cost in Distributed PageRank Computation – A Theoretical Study

2020-01-01ICML 2020Unverified0· sign in to hype

Siqiang Luo

Unverified — Be the first to reproduce this paper.

Reproduce

Abstract

PageRank is a widely used approach for measuring the importance of a node in a graph. Computing PageRank is a fundamental task in numerous applications including web search, machine learning and recommendation systems. The importance of computing PageRanks in a distributed environment has been recognized due to the rapid growth of the graph size in real world. However, only a few previous works can provide a provable complexity and accuracy for distributed PageRank computation. Given a constant d>0 and a graph of n nodes and under the well-known congested-clique distributed model, the state-of-the-art approach, Radar-Push, uses O(n+d) communication rounds to approximate the PageRanks within a relative error O(1^dn). However, Radar-Push entails as large as O(^2d+3n) bits of bandwidth (e.g., the communication cost between a pair of nodes per round) in the worst case. In this paper, we provide a new algorithm that uses asymptotically the same communication rounds while significantly improves the bandwidth from O(^2d+3n) bits to O(d^3n) bits. To the best of our knowledge, our distributed PageRank algorithm is the first to achieve o(dn) communication rounds with O(d^3n) bits of bandwidth in approximating PageRanks with relative error O(1^dn) under the congested-clique model.

Tasks

Reproductions