SOTAVerified

Benchmarking

Papers

Showing 111120 of 5548 papers

TitleStatusHype
Exploring Progress in Multivariate Time Series Forecasting: Comprehensive Benchmarking and Heterogeneity AnalysisCode3
A Unified Framework for Rank-based Evaluation Metrics for Link Prediction in Knowledge GraphsCode3
A Vision-Language Foundation Model to Enhance Efficiency of Chest X-ray InterpretationCode3
ChartGalaxy: A Dataset for Infographic Chart Understanding and GenerationCode3
Benchmarking and Improving Bird's Eye View Perception Robustness in Autonomous DrivingCode3
CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning AlgorithmsCode3
ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI SystemsCode3
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual ComprehensionCode3
IFEval-Audio: Benchmarking Instruction-Following Capability in Audio-based Large Language ModelsCode3
DrivAerNet++: A Large-Scale Multimodal Car Dataset with Computational Fluid Dynamics Simulations and Deep Learning BenchmarksCode3
Show:102550
← PrevPage 12 of 555Next →

Benchmark Results

#ModelMetricClaimedVerifiedStatus
1GPT-4 TurboACC0.56Unverified