SOTAVerified

Benchmarking

Papers

Showing 21-30 of 5548 papers

Title | Status | Hype
Hyperspectral Anomaly Detection Methods: A Survey and Comparative Study | - | 0
SenseShift6D: Multimodal RGB-D Benchmarking for Robust 6D Pose Estimation across Environment and Sensor Variations | Code | 0
Inaugural MOASEI Competition at AAMAS'2025: A Technical Report | - | 0
LLMThinkBench: Towards Basic Math Reasoning and Overthinking in Large Language Models | Code | 1
GDGB: A Benchmark for Generative Dynamic Text-Attributed Graph Learning | Code | 2
STRUCTSENSE: A Task-Agnostic Agentic Framework for Structured Information Extraction with Human-In-The-Loop Evaluation and Benchmarking | Code | 0
LANTERN: A Machine Learning Framework for Lipid Nanoparticle Transfection Efficiency Prediction | Code | 0
CORE: Benchmarking LLMs Code Reasoning Capabilities through Static Analysis Tasks | - | 0
Latent Thermodynamic Flows: Unified Representation Learning and Generative Modeling of Temperature-Dependent Behaviors from Limited Data | Code | 1
TransLaw: Benchmarking Large Language Models in Multi-Agent Simulation of the Collaborative Translation | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified