SOTAVerified

Zero-Shot Video Question Answer

This task presents zero-shot question answering results on the TGIF-QA dataset for LLM-powered video conversational models.

Papers

Showing 41-50 of 85 papers

Title | Status | Hype
PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance | Code | 2
vid-TLDR: Training Free Token merging for Light-weight Video Transformer | Code | 2
VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos | Code | 2
Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding | Code | 2
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding | Code | 2
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning | Code | 2
Understanding Long Videos with Multimodal Language Models | Code | 2
Valley: Video Assistant with Large Language model Enhanced abilitY | Code | 2
Language Repository for Long Video Understanding | Code | 1
Shot2Story20K: A New Benchmark for Comprehensive Understanding of Multi-shot Videos | Code | 1

No leaderboard results yet.