SOTAVerified

Benchmarking

Papers

Showing 3401-3410 of 5548 papers

Title | Status | Hype
MMMR: Benchmarking Massive Multi-Modal Reasoning Tasks | | 0
MMSciBench: Benchmarking Language Models on Multimodal Scientific Problems | | 0
MMSearch: Benchmarking the Potential of Large Models as Multi-modal Search Engines | | 0
MobileAgentBench: An Efficient and User-Friendly Benchmark for Mobile LLM Agents | | 0
MobileAIBench: Benchmarking LLMs and LMMs for On-Device Use Cases | | 0
Model Agnostic Explainable Selective Regression via Uncertainty Estimation | | 0
Model-based trajectory stitching for improved behavioural cloning and its applications | | 0
Model-Based Underwater 6D Pose Estimation from RGB | | 0
ModelHub.AI: Dissemination Platform for Deep Learning Models | | 0
Model Lakes | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified