SOTAVerified

Benchmarking

Papers

Showing 2981–2990 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Anchor Points: Benchmarking Models with Much Fewer Examples | Code | 0 |
| M3Dsynth: A dataset of medical 3D images with AI-generated local manipulations | Code | 0 |
| Leveraging Contextual Information for Effective Entity Salience Detection | | 0 |
| Benchmarking machine learning models for quantum state classification | | 0 |
| VerilogEval: Evaluating Large Language Models for Verilog Code Generation | Code | 2 |
| So you think you can track? | | 0 |
| Benchmarking Procedural Language Understanding for Low-Resource Languages: A Case Study on Turkish | Code | 0 |
| An Image Dataset for Benchmarking Recommender Systems with Raw Pixels | Code | 1 |
| AmodalSynthDrive: A Synthetic Amodal Perception Dataset for Autonomous Driving | | 0 |
| Unveiling the potential of large language models in generating semantic and cross-language clones | | 0 |
Page 299 of 555

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |