On Semantic Similarity in Video Retrieval

2021-03-18 · CVPR 2021

Michael Wray, Hazel Doughty, Dima Damen

Code

github.com/mwray/Semantic-Video-Retrieval (official, in paper, ★ 53)
github.com/aranciokov/fsmmda_videoretrieval (PyTorch, ★ 10)
github.com/aranciokov/ranp (PyTorch, ★ 5)

Abstract

Current video retrieval efforts all base their evaluation on an instance-based assumption, namely that only a single caption is relevant to a query video and vice versa. We demonstrate that this assumption yields performance comparisons that are often not indicative of models' retrieval capabilities. We propose a move to semantic similarity video retrieval, where (i) multiple videos/captions can be deemed equally relevant, and their relative ranking does not affect a method's reported performance, and (ii) retrieved videos/captions are ranked by their similarity to a query. We propose several proxies to estimate semantic similarities in large-scale retrieval datasets, without additional annotations. Our analysis is performed on three commonly used video retrieval datasets (MSR-VTT, YouCook2 and EPIC-KITCHENS).
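
To make properties (i) and (ii) concrete, here is a minimal sketch of a graded-relevance evaluation. It assumes normalised discounted cumulative gain (nDCG) as the metric and a simple word-overlap score as the similarity proxy. Both are illustrative choices that the abstract does not specify, and the names `word_overlap`, `dcg`, `ndcg` and `score_ranking` are hypothetical.

```python
import math


def word_overlap(caption_a, caption_b):
    """Hypothetical proxy: Jaccard overlap between lowercase word sets."""
    a = set(caption_a.lower().split())
    b = set(caption_b.lower().split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dcg(relevances):
    """Discounted cumulative gain: relevance discounted by log2 of the rank."""
    return sum(r / math.log2(rank + 2) for rank, r in enumerate(relevances))


def ndcg(ranked_relevances):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0


def score_ranking(query, gallery, order):
    """Score a retrieval order (indices into gallery) against the proxy."""
    relevances = [word_overlap(query, gallery[i]) for i in order]
    return ndcg(relevances)


# Toy data (made up for illustration, not drawn from any dataset).
query = "cut the onion"
gallery = ["slice the onion", "cut the onion", "chop the onion", "wash the pan"]

# Items 0 and 2 are equally similar to the query under this proxy,
# so swapping them leaves the score unchanged (property i).
best = score_ranking(query, gallery, [1, 0, 2, 3])
swapped = score_ranking(query, gallery, [1, 2, 0, 3])

# Ranking the least similar caption first lowers the score (property ii).
worst = score_ranking(query, gallery, [3, 0, 2, 1])

print(best, swapped, worst)
```

Under an instance-based metric such as recall at rank k, only the exact match at index 1 would count, and "slice the onion" and "chop the onion" would be treated as no better than "wash the pan".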

Tasks

Retrieval, Semantic Similarity, Semantic Textual Similarity, Video Retrieval
