SOTAVerified

Zero-Shot Video Question Answer

This task presents zero-shot question answering results on the TGIF-QA dataset for LLM-powered video conversational models.

Papers

Showing 71-80 of 85 papers

Title | Status | Hype
Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding | Code | 4
Self-Chained Image-Language Model for Video Localization and Question Answering | Code | 1
VideoChat: Chat-Centric Video Understanding | Code | 4
LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model | Code | 5
mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality | Code | 4
Verbs in Action: Improving verb understanding in video-language models | Code | 0
ViperGPT: Visual Inference via Python Execution for Reasoning | Code | 3
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation | Code | 0
InternVideo: General Video Foundation Models via Generative and Discriminative Learning | Code | 4
0/1 Deep Neural Networks via Block Coordinate Descent | - | 0

No leaderboard results yet.