SOTAVerified

Zero-Shot Video Question Answer

This task presents zero-shot question answering results on the TGIF-QA dataset for LLM-powered video conversational models.

Papers

Showing 51–60 of 85 papers

Title | Status | Hype
A Simple LLM Framework for Long-Range Video Question-Answering | Code | 1
TS-LLaVA: Constructing Visual Tokens through Thumbnail-and-Sampling for Training-Free Video Large Language Models | Code | 1
Agentic Keyframe Search for Video Question Answering | Code | 1
EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding | Code | 1
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models | Code | 1
BT-Adapter: Video Conversation is Feasible Without Video Instruction Tuning | Code | 1
VideoMultiAgents: A Multi-Agent Framework for Video Question Answering | Code | 1
OmniDataComposer: A Unified Data Structure for Multimodal Data Fusion and Infinite Data Generation | Code | 1
Too Many Frames, Not All Useful: Efficient Strategies for Long-Form Video QA | Code | 1
TraveLER: A Modular Multi-LMM Agent Framework for Video Question-Answering | Code | 1

No leaderboard results yet.