SOTAVerified

Benchmarking

Papers

Showing 1041-1050 of 5548 papers

Title | Status | Hype
Benchmarking Pathology Feature Extractors for Whole Slide Image Classification | Code | 1
CloudEval-YAML: A Practical Benchmark for Cloud Configuration Generation | Code | 1
CO-Bench: Benchmarking Language Model Agents in Algorithm Search for Combinatorial Optimization | Code | 1
CodeIF: Benchmarking the Instruction-Following Capabilities of Large Language Models for Code Generation | Code | 1
AsEP: Benchmarking Deep Learning Methods for Antibody-specific Epitope Prediction | Code | 1
A Global Benchmark of Algorithms for Segmenting Late Gadolinium-Enhanced Cardiac Magnetic Resonance Imaging | Code | 1
A Scale-Invariant Sorting Criterion to Find a Causal Order in Additive Noise Models | Code | 1
A global analysis of metrics used for measuring performance in natural language processing | Code | 1
Chakra: Advancing Performance Benchmarking and Co-design using Standardized Execution Traces | Code | 1
Clinical Prompt Learning with Frozen Language Models | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified