SOTAVerified

Benchmarking

Papers

Showing 2931–2940 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Enhancing Biomedical Knowledge Discovery for Diseases: An Open-Source Framework Applied on Rett Syndrome and Alzheimer's Disease | Code | 0 |
| Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle | | 0 |
| RT-Pose: A 4D Radar Tensor-based 3D Human Pose Estimation and Localization Benchmark | | 0 |
| Language-Driven 6-DoF Grasp Detection Using Negative Prompt Guidance | | 0 |
| Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks | Code | 0 |
| FETCH: A Memory-Efficient Replay Approach for Continual Learning in Image Classification | | 0 |
| Is Sarcasm Detection A Step-by-Step Reasoning Process in Large Language Models? | | 0 |
| LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models | | 0 |
| Abstraction Alignment: Comparing Model-Learned and Human-Encoded Conceptual Relationships | Code | 0 |
| HIMO: A New Benchmark for Full-Body Human Interacting with Multiple Objects | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |