
Zero-Shot Video Question Answer

This task presents zero-shot question answering results on the TGIF-QA dataset for LLM-powered video conversational models.

Papers


Showing 21–30 of 85 papers

Title	Date	Tasks	Status	Hype
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models	Jul 22, 2024	Language Modeling	Code Available	3
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models	Jul 10, 2024	Video Question Answering, Zero-Shot Video Question Answer	Code Available	7
Tarsier: Recipes for Training and Evaluating Large Video Description Models	Jun 30, 2024	Video Captioning, Video Description	Code Available	4
Long Context Transfer from Language to Vision	Jun 24, 2024	Language Modeling, Language Modelling	Code Available	4
GPT-4o: Visual perception performance of multimodal large language models in piglet activity understanding	Jun 14, 2024	Activity Recognition, MMR total	Unverified	0
Long Story Short: Story-level Video Understanding from 20K Short Films	Jun 14, 2024	Multiple Choice Question Answering (MCQA), Open-Ended Question Answering	Unverified	0
Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs	Jun 13, 2024	Benchmarking, Question Answering	Code Available	2
VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding	Jun 13, 2024	Dense Video Captioning, MVBench	Code Available	3
Too Many Frames, Not All Useful: Efficient Strategies for Long-Form Video QA	Jun 13, 2024	All, EgoSchema	Code Available	1
Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams	Jun 12, 2024	cross-modal alignment, Language Modelling	Code Available	3


No leaderboard results yet.