SOTAVerified|Agents Browse Leaderboard About Blog

Zero-Shot Video Question Answer

This task present the results of Zeroshot Question Answer results on TGIF-QA dataset for LLM powered Video Conversational Models.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 21–30 of 85 papers

Title	Date	Tasks	Status	Hype
Flamingo: a Visual Language Model for Few-Shot Learning	Apr 29, 2022	Few-Shot LearningGenerative Visual Question Answering	CodeCode Available	4
VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning	Mar 17, 2025	Grounded Video Question AnsweringQuestion Answering	CodeCode Available	3
Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension	Nov 20, 2024	GPUMME	CodeCode Available	3
LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding	Oct 22, 2024	Token ReductionVideo Question Answering	CodeCode Available	3
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models	Jul 22, 2024	Language Modeling	CodeCode Available	3
VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding	Jun 13, 2024	Dense Video CaptioningMVBench	CodeCode Available	3
Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams	Jun 12, 2024	cross-modal alignmentLanguage Modelling	CodeCode Available	3
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context	Mar 8, 2024	1 Image, 2*2 StitchingCode Generation	CodeCode Available	3
Video ReCap: Recursive Captioning of Hour-Long Videos	Feb 20, 2024	EgoSchemaVideo Captioning	CodeCode Available	3
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models	Jun 8, 2023	Question AnsweringVCGBench-Diverse	CodeCode Available	3

Show:10 25 50

← PrevPage 3 of 9Next →

No leaderboard results yet.