SOTAVerified

Benchmarking

Papers

Showing 511-520 of 5548 papers

Title | Status | Hype
Benchmarking Embedding Aggregation Methods in Computational Pathology: A Clinical Data Perspective | Code | 1
Collective Knowledge: organizing research projects as a database of reusable components and portable workflows with common APIs | Code | 1
CodeS: Natural Language to Code Repository via Multi-Layer Sketch | Code | 1
An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction | Code | 1
Benchmarking Encoder-Decoder Architectures for Biplanar X-ray to 3D Shape Reconstruction | Code | 1
CodeUpdateArena: Benchmarking Knowledge Editing on API Updates | Code | 1
CombiBench: Benchmarking LLM Capability for Combinatorial Mathematics | Code | 1
Contemporary Symbolic Regression Methods and their Relative Performance | Code | 1
CODEBench: A Neural Architecture and Hardware Accelerator Co-Design Framework | Code | 1
Codabench: Flexible, Easy-to-Use and Reproducible Benchmarking Platform | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified