SOTAVerified

Streaming Pointwise Mutual Information

2009-12-01NeurIPS 2009Unverified0· sign in to hype

Benjamin V. Durme, Ashwin Lall

Unverified — Be the first to reproduce this paper.

Reproduce

Abstract

Recent work has led to the ability to perform space efficient, approximate counting over large vocabularies in a streaming context. Motivated by the existence of data structures of this type, we explore the computation of associativity scores, other- wise known as pointwise mutual information (PMI), in a streaming context. We give theoretical bounds showing the impracticality of perfect online PMI compu- tation, and detail an algorithm with high expected accuracy. Experiments on news articles show our approach gives high accuracy on real world data.

Tasks

Reproductions