SOTAVerified

Video Retrieval

The objective of video retrieval is as follows: given a text query and a pool of candidate videos, select the video that corresponds to the query. Typically, the candidates are returned as a ranked list and scored with document retrieval metrics such as recall at K (R@K).
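As a rough illustration of how a ranked list might be scored, the sketch below computes recall at K (the R@1, R@5, and R@10 figures reported in the benchmark tables) from a query-by-video similarity matrix. It assumes exactly one ground-truth video per query, stored at the same index as the query, and it resolves ties in favor of the correct video. The names `rank_of_correct`, `recall_at_k`, and `similarity` are hypothetical and not taken from any particular benchmark's evaluation code.

```python
from typing import Sequence


def rank_of_correct(scores: Sequence[float], correct_index: int) -> int:
    """Return the 1-based rank of the correct video under descending score order."""
    target = scores[correct_index]
    # Count candidates scored strictly higher than the correct video.
    # Assumption: ties are resolved in favor of the correct video.
    return 1 + sum(1 for s in scores if s > target)


def recall_at_k(similarity: Sequence[Sequence[float]], k: int) -> float:
    """Percentage of queries whose correct video appears in the top k.

    Assumption: similarity[i][j] scores query i against video j, and
    video i is the single ground-truth match for query i.
    """
    hits = sum(
        1 for i, row in enumerate(similarity) if rank_of_correct(row, i) <= k
    )
    return 100.0 * hits / len(similarity)


# Toy example: 3 text queries scored against 3 candidate videos.
similarity = [
    [0.9, 0.1, 0.3],  # query 0: correct video is ranked 1st
    [0.8, 0.5, 0.6],  # query 1: correct video is ranked 3rd
    [0.2, 0.7, 0.4],  # query 2: correct video is ranked 2nd
]

for k in (1, 2, 3):
    print(f"R@{k} = {recall_at_k(similarity, k):.1f}")
```

In this toy case one of three queries is answered correctly at rank 1, so R@1 is one third; R@K can only grow as K increases, which is why R@10 figures are not directly comparable with R@1 figures in the tables.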

Papers

Showing 276-300 of 486 papers

| Title | Status | Hype |
|---|---|---|
| Clarification of Video Retrieval Query Results by the Automated Insertion of Supporting Shots | | 0 |
| Classroom Video Assessment and Retrieval via Multiple Instance Learning | | 0 |
| CLIP2TV: Align, Match and Distill for Video-Text Retrieval | | 0 |
| CLOP: Video-and-Language Pre-Training with Knowledge Regularizations | | 0 |
| CMAWRNet: Multiple Adverse Weather Removal via a Unified Quaternion Neural Architecture | | 0 |
| CNN Retrieval based Unsupervised Metric Learning for Near-Duplicated Video Retrieval | | 0 |
| Coarse to Fine: Video Retrieval before Moment Localization | | 0 |
| CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing | | 0 |
| Colo-SCRL: Self-Supervised Contrastive Representation Learning for Colonoscopic Video Retrieval | | 0 |
| Contrastive Video-Language Learning with Fine-grained Frame Sampling | | 0 |
| Controllable Augmentations for Video Representation Learning | | 0 |
| COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval | | 0 |
| CREATE: A Benchmark for Chinese Short Video Retrieval and Title Generation | | 0 |
| CREATE: A Benchmark for Chinese Short Video Retrieval and Title Generation | | 0 |
| CUPID: Adaptive Curation of Pre-training Data for Video-and-Language Representation Learning | | 0 |
| Deep Heterogeneous Hashing for Face Video Retrieval | | 0 |
| Deep Learning Based Semantic Video Indexing and Retrieval | | 0 |
| De-Hashing: Server-Side Context-Aware Feature Reconstruction for Mobile Visual Search | | 0 |
| Detours for Navigating Instructional Videos | | 0 |
| Discrete Wavelet Transform and Gradient Difference based approach for text localization in videos | | 0 |
| Distilling Vision-Language Models on Millions of Videos | | 0 |
| Domain Adaptation in Multi-View Embedding for Cross-Modal Video Retrieval | | 0 |
| Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval | | 0 |
| EA-VTR: Event-Aware Video-Text Retrieval | | 0 |
| Efficient Action Detection in Untrimmed Videos via Multi-Task Learning | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | OmniVec | text-to-video R@10 | 89.4 | | Unverified |
| 2 | CLIP4Clip | text-to-video R@10 | 81.6 | | Unverified |
| 3 | OmniVec (pretrained) | text-to-video R@10 | 78.6 | | Unverified |
| 4 | HunYuan_tvr (huge) | text-to-video R@1 | 62.9 | | Unverified |
| 5 | CLIP-ViP | text-to-video R@1 | 57.7 | | Unverified |
| 6 | PIDRo | text-to-video R@1 | 55.9 | | Unverified |
| 7 | DMAE (ViT-B/16) | text-to-video R@1 | 55.5 | | Unverified |
| 8 | HunYuan_tvr | text-to-video R@1 | 55 | | Unverified |
| 9 | MuLTI | text-to-video R@1 | 54.7 | | Unverified |
| 10 | EERCF | text-to-video R@1 | 54.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Aurora (ours, r=64) | text-to-video R@5 | 77.4 | | Unverified |
| 2 | InternVideo2-6B | text-to-video R@1 | 74.2 | | Unverified |
| 3 | vid-TLDR (UMT-L) | text-to-video R@1 | 72.3 | | Unverified |
| 4 | VAST | text-to-video R@1 | 72 | | Unverified |
| 5 | COSA | text-to-video R@1 | 70.5 | | Unverified |
| 6 | UMT-L (ViT-L/16) | text-to-video R@1 | 70.4 | | Unverified |
| 7 | GRAM | text-to-video R@1 | 67.3 | | Unverified |
| 8 | VALOR | text-to-video R@1 | 61.5 | | Unverified |
| 9 | TESTA (ViT-B/16) | text-to-video R@1 | 61.2 | | Unverified |
| 10 | VindLU | text-to-video R@1 | 61.2 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GRAM | text-to-video R@1 | 64 | | Unverified |
| 2 | VAST | text-to-video R@1 | 63.9 | | Unverified |
| 3 | InternVideo2-6B | text-to-video R@1 | 62.8 | | Unverified |
| 4 | VALOR | text-to-video R@1 | 59.9 | | Unverified |
| 5 | UMT-L (ViT-L/16) | text-to-video R@1 | 58.8 | | Unverified |
| 6 | vid-TLDR (UMT-L) | text-to-video R@1 | 58.1 | | Unverified |
| 7 | COSA | text-to-video R@1 | 57.9 | | Unverified |
| 8 | InternVideo2-6B | text-to-video R@1 | 55.9 | | Unverified |
| 9 | InternVideo | text-to-video R@1 | 55.2 | | Unverified |
| 10 | VLAB | text-to-video R@1 | 55.1 | | Unverified |

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | EMCL-Net (Ours)++ LSMDC Rohrbach et al. (2015) | text-to-video R@10 | 53.7 | | Unverified |
| 2 | InternVideo2-6B | text-to-video R@1 | 46.4 | | Unverified |
| 3 | vid-TLDR (UMT-L) | text-to-video R@1 | 43.1 | | Unverified |
| 4 | UMT-L (ViT-L/16) | text-to-video R@1 | 43 | | Unverified |
| 5 | HunYuan_tvr (huge) | text-to-video R@1 | 40.4 | | Unverified |
| 6 | COSA | text-to-video R@1 | 39.4 | | Unverified |
| 7 | mPLUG-2 | text-to-video R@1 | 34.4 | | Unverified |
| 8 | VALOR | text-to-video R@1 | 34.2 | | Unverified |
| 9 | InternVideo | text-to-video R@1 | 34 | | Unverified |
| 10 | InternVideo2-6B | text-to-video R@1 | 33.8 | | Unverified |