SOTAVerified

TGIF-Action

Papers

Showing 7 of 7 papers

| Title | Status | Hype |
| --- | --- | --- |
| Lightweight Recurrent Cross-modal Encoder for Video Question Answering | Code | 0 |
| MELTR: Meta Loss Transformer for Learning to Fine-tune Video Foundation Models | Code | 1 |
| MuLTI: Efficient Video-and-Language Understanding with Text-Guided MultiWay-Sampler and Multiple Choice Modeling | | 0 |
| HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training | | 0 |
| An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling | Code | 1 |
| Clover: Towards A Unified Video-Language Alignment and Fusion Model | Code | 1 |
| All in One: Exploring Unified Video-Language Pre-training | Code | 2 |

No leaderboard results yet.