SOTAVerified

Benchmarking

Papers

Showing 3241-3250 of 5548 papers

Title | Status | Hype
Label-Efficient Point Cloud Semantic Segmentation: An Active Learning Approach | | 0
Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models | | 0
AI Cyber Risk Benchmark: Automated Exploitation Capabilities | | 0
λ: A Benchmark for Data-Efficiency in Long-Horizon Indoor Mobile Manipulation Robotics | | 0
LabSafety Bench: Benchmarking LLMs on Safety Issues in Scientific Labs | | 0
Time and Tokens: Benchmarking End-to-End Speech Dysfluency Detection | | 0
LAG-MMLU: Benchmarking Frontier LLM Understanding in Latvian and Giriama | | 0
Benchmarking Online Sequence-to-Sequence and Character-based Handwriting Recognition from IMU-Enhanced Pens | | 0
Time Awareness in Large Language Models: Benchmarking Fact Recall Across Time | | 0
Benchmarking Online Object Trackers for Underwater Robot Position Locking Applications | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified