SOTAVerified

Benchmarking

Papers

Showing 2411–2420 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Alibaba’s Submission for the WMT 2020 APE Shared Task: Improving Automatic Post-Editing with Pre-trained Conditional Cross-Lingual BERT | | 0 |
| Benchmarking the Reliability of Post-training Quantization: a Particular Focus on Worst-case Performance | | 0 |
| Benchmarking the rationality of AI decision making using the transitivity axiom | | 0 |
| Automated 3D Tumor Segmentation using Temporal Cubic PatchGAN (TCuP-GAN) | | 0 |
| Functional Code Building Genetic Programming | | 0 |
| Benchmarking the Physical-world Adversarial Robustness of Vehicle Detection | | 0 |
| AutoLay: Benchmarking amodal layout estimation for autonomous driving | | 0 |
| Benchmarking the Neural Linear Model for Regression | | 0 |
| Algorithm Selection with Probing Trajectories: Benchmarking the Choice of Classifier Model | | 0 |
| Efficient Pauli channel estimation with logarithmic quantum memory | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |