SOTAVerified|Agents Browse Leaderboard About Blog

Zero-Shot Video Question Answer

This task present the results of Zeroshot Question Answer results on TGIF-QA dataset for LLM powered Video Conversational Models.

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1–10 of 85 papers

Title	Date	Tasks	Status	Hype
MiniCPM-V: A GPT-4V Level MLLM on Your Phone	Aug 3, 2024	HallucinationMultiple-choice	CodeCode Available	12
Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution	Sep 18, 2024	Natural Language Visual Grounding	CodeCode Available	11
InternVideo2: Scaling Foundation Models for Multimodal Video Understanding	Mar 22, 2024	Action ClassificationAction Recognition	CodeCode Available	7
Qwen2.5-Omni Technical Report	Mar 26, 2025	Automatic Speech Recognition (ASR)GSM8K	CodeCode Available	7
LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models	Jul 10, 2024	Video Question AnsweringZero-Shot Video Question Answer	CodeCode Available	7
Mistral 7B	Oct 10, 2023	answerability predictionArithmetic Reasoning	CodeCode Available	6
LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model	Apr 28, 2023	Instruction Followingmodel	CodeCode Available	5
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs	Jun 11, 2024	Multiple-choiceQuestion Answering	CodeCode Available	5
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens	Apr 4, 2024	Language ModelingLanguage Modelling	CodeCode Available	4
Flamingo: a Visual Language Model for Few-Shot Learning	Apr 29, 2022	Few-Shot LearningGenerative Visual Question Answering	CodeCode Available	4

Show:10 25 50

← PrevPage 1 of 9Next →

No leaderboard results yet.