Learning from Video and Text via Large-Scale Discriminative Clustering

2017-07-27 · ICCV 2017

Antoine Miech, Jean-Baptiste Alayrac, Piotr Bojanowski, Ivan Laptev, Josef Sivic

Code

github.com/jpeyre/unrel
github.com/antoine77340/iccv17learning

Abstract

Discriminative clustering has been successfully applied to a number of weakly-supervised learning tasks. Such applications include person and action recognition, text-to-video alignment, object co-segmentation and colocalization in videos and images. One drawback of discriminative clustering, however, is its limited scalability. We address this issue and propose an online optimization algorithm based on the Block-Coordinate Frank-Wolfe algorithm. We apply the proposed method to the problem of weakly-supervised learning of actions and actors from movies together with corresponding movie scripts. Scaling the learning problem up to 66 feature-length movies enables us to significantly improve weakly-supervised action recognition.
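
To give a feel for the Block-Coordinate Frank-Wolfe idea named in the abstract, here is a minimal sketch on a toy problem. It is not the paper's algorithm or objective: the quadratic cost, the function name `bcfw_rows_on_simplex`, and the `target` matrix are all assumptions chosen for illustration. Each row of the assignment matrix `Y` is treated as one block constrained to the probability simplex, and each iteration updates a single randomly chosen row, which is what makes the per-step cost independent of the number of rows.

```python
import numpy as np

def bcfw_rows_on_simplex(target, n_iters=500, seed=0):
    """Toy BCFW: minimize 0.5 * ||Y - target||_F^2, each row of Y on the simplex.

    The objective is a stand-in for illustration only, not the paper's cost.
    """
    rng = np.random.default_rng(seed)
    n, k = target.shape
    Y = np.full((n, k), 1.0 / k)           # feasible start: uniform rows
    for _ in range(n_iters):
        i = rng.integers(n)                # sample one block (row) uniformly
        grad_i = Y[i] - target[i]          # gradient of the toy cost w.r.t. row i
        s = np.zeros(k)
        s[np.argmin(grad_i)] = 1.0         # linear minimization oracle: a simplex vertex
        d = s - Y[i]                       # Frank-Wolfe direction for this block
        denom = float(d @ d)
        if denom == 0.0:
            continue                       # row already sits on the chosen vertex
        # Exact line search, available here because the toy cost is quadratic.
        gamma = float(np.clip(-(grad_i @ d) / denom, 0.0, 1.0))
        Y[i] = Y[i] + gamma * d            # convex combination, so the row stays feasible
    return Y
```

Because every update is a convex combination of the current row and a simplex vertex, the iterate never leaves the feasible set and no projection step is needed. In a real discriminative clustering setting the gradient would come from the clustering cost and the blocks would encode the weak-supervision constraints; those details are specific to the paper and are not reproduced here.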

Tasks

Action Recognition, Clustering, Temporal Action Localization, Video Alignment, Video Retrieval, Weakly-Supervised Action Recognition, Weakly-supervised Learning

Benchmark Results

Dataset	Model	Metric	Claimed	Verified	Status
LSMDC	Large-Scale Discriminative Clustering	text-to-video R@1	7.3	—	Unverified
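
For readers unfamiliar with the R@1 metric in the table, the sketch below shows how recall at K is commonly computed for text-to-video retrieval. It assumes a square similarity matrix `sim` in which row `i` is a text query and column `i` is its ground-truth video; the function name and the sample scores are hypothetical, and the evaluation protocol actually used for the LSMDC number is not stated on this page.

```python
import numpy as np

def recall_at_k(sim, k=1):
    """Fraction of text queries whose ground-truth video ranks in the top k.

    Assumes sim[i, j] scores query i against video j, with video i correct.
    """
    sim = np.asarray(sim, dtype=float)
    correct = np.diag(sim)[:, None]          # score of the correct video per query
    ranks = (sim > correct).sum(axis=1)      # number of videos scoring strictly higher
    return float((ranks < k).mean())

# Hypothetical scores for three queries and three videos.
sim = [[0.9, 0.1, 0.0],
       [0.8, 0.2, 0.1],
       [0.1, 0.2, 0.7]]
r1 = recall_at_k(sim, k=1)   # queries 0 and 2 retrieve their video first
r2 = recall_at_k(sim, k=2)   # query 1's video is second, so all three count
```

Benchmark tables usually report this value as a percentage, so a fraction returned by this function would be multiplied by 100 before comparison.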
