Communication-Efficient Algorithms for Statistical Optimization

2012-12-01NeurIPS 2012Unverified0· sign in to hype

Yuchen Zhang, Martin J. Wainwright, John C. Duchi

Unverified — Be the first to reproduce this paper.

Abstract

We study two communication-efficient algorithms for distributed statistical optimization on large-scale data. The first algorithm is an averaging method that distributes the N data samples evenly to m machines, performs separate minimization on each subset, and then averages the estimates. We provide a sharp analysis of this average mixture algorithm, showing that under a reasonable set of conditions, the combined parameter achieves mean-squared error that decays as (N^-1+(N/m)^-2). Whenever m N, this guarantee matches the best possible rate achievable by a centralized algorithm having access to all N samples. The second algorithm is a novel method, based on an appropriate form of the bootstrap. Requiring only a single round of communication, it has mean-squared error that decays as (N^-1+(N/m)^-3), and so is more robust to the amount of parallelization. We complement our theoretical results with experiments on large-scale problems from the Microsoft Learning to Rank dataset.

Tasks

Learning-To-Rank

Communication-Efficient Algorithms for Statistical Optimization

Abstract

Tasks

Reproductions