SOTAVerified

Benchmarking

Papers

Showing 2401-2425 of 5548 papers

Title | Status | Hype
Automated legal reasoning with discretion to act using s(LAW) | | 0
Benchmarking the Robustness of Quantized Models | | 0
Benchmarking the Robustness of Panoptic Segmentation for Automated Driving | | 0
Automated Factual Benchmarking for In-Car Conversational Systems using Large Language Models | | 0
A lightweight and accurate YOLO-like network for small target detection in Aerial Imagery | | 0
A Baseline Method for Removing Invisible Image Watermarks using Deep Image Prior | | 0
Benchmarking the Robustness of Instance Segmentation Models | | 0
Automated detection of gibbon calls from passive acoustic monitoring data using convolutional neural networks in the "torch for R" ecosystem | | 0
Genetic algorithm for feature selection of EEG heterogeneous data | | 0
Galvatron: An Automatic Distributed System for Efficient Foundation Model Training | | 0
Alibaba’s Submission for the WMT 2020 APE Shared Task: Improving Automatic Post-Editing with Pre-trained Conditional Cross-Lingual BERT | | 0
Benchmarking the Reliability of Post-training Quantization: a Particular Focus on Worst-case Performance | | 0
Benchmarking the rationality of AI decision making using the transitivity axiom | | 0
Automated 3D Tumor Segmentation using Temporal Cubic PatchGAN (TCuP-GAN) | | 0
Benchmarking the Physical-world Adversarial Robustness of Vehicle Detection | | 0
AutoLay: Benchmarking amodal layout estimation for autonomous driving | | 0
Benchmarking the Neural Linear Model for Regression | | 0
Algorithm Selection with Probing Trajectories: Benchmarking the Choice of Classifier Model | | 0
Benchmarking the Impact of Noise on Deep Learning-based Classification of Atrial Fibrillation in 12-Lead ECG | | 0
Functional Code Building Genetic Programming | | 0
Benchmarking the human brain against computational architectures | | 0
A Conformance Checking-based Approach for Drift Detection in Business Processes | | 0
FunBench: Benchmarking Fundus Reading Skills of MLLMs | | 0
Efficient Pauli channel estimation with logarithmic quantum memory | | 0
AutoAI-TS: AutoAI for Time Series Forecasting | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified