SOTAVerified

Benchmarking

Papers

Showing 1681–1690 of 5548 papers

Title | Status | Hype
Changepoint Detection in Noisy Data Using a Novel Residuals Permutation-Based Method (RESPERM): Benchmarking and Application to Single Trial ERPs | Code | 0
Benchmarking and optimizing organism wide single-cell RNA alignment methods | Code | 0
CEBench: A Benchmarking Toolkit for the Cost-Effectiveness of LLM Pipelines | Code | 0
Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation | Code | 0
JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models | Code | 0
Knowledge Enhanced Conditional Imputation for Healthcare Time-series | Code | 0
IOLBENCH: Benchmarking LLMs on Linguistic Reasoning | Code | 0
IoT Data Trust Evaluation via Machine Learning | Code | 0
InViG: Benchmarking Interactive Visual Grounding with 500K Human-Robot Interactions | Code | 0
A Benchmarking Study of Vision-based Robotic Grasping Algorithms | Code | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified