SOTAVerified

Zero-Shot Video Question Answer

This task presents zero-shot question-answering results on the TGIF-QA dataset for LLM-powered video conversational models.

Papers

Showing 31–40 of 85 papers

Title | Status | Hype
ViperGPT: Visual Inference via Python Execution for Reasoning | Code | 3
LinVT: Empower Your Image-level Large Language Model to Understand Videos | Code | 2
PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance | Code | 2
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning | Code | 2
Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs | Code | 2
VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos | Code | 2
An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM | Code | 2
Elysium: Exploring Object-level Perception in Videos via MLLM | Code | 2
Understanding Long Videos with Multimodal Language Models | Code | 2
vid-TLDR: Training Free Token merging for Light-weight Video Transformer | Code | 2
Page 4 of 9

No leaderboard results yet.