SOTAVerified

Benchmarking

Papers

Showing 1711–1720 of 5548 papers

Title | Status | Hype
A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking | | 0
TDVE-Assessor: Benchmarking and Evaluating the Quality of Text-Driven Video Editing with LMMs | | 0
AgentRecBench: Benchmarking LLM Agent-based Personalized Recommender Systems | | 0
FinLoRA: Benchmarking LoRA Methods for Fine-Tuning LLMs on Financial Datasets | | 0
Benchmarking Large Multimodal Models for Ophthalmic Visual Question Answering with OphthalWeChat | | 0
SpokenNativQA: Multilingual Everyday Spoken Queries for LLMs | | 0
Retrieval-Augmented Generation for Service Discovery: Chunking Strategies and Benchmarking | | 0
EnvSDD: Benchmarking Environmental Sound Deepfake Detection | | 0
DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research | | 0
AssistedDS: Benchmarking How External Domain Knowledge Assists LLMs in Automated Data Science | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified