SOTAVerified

Benchmarking

Papers

Showing 1151-1175 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Deep Learning-Based Synchronization for Uplink NB-IoT | Code | 1 |
| Oracle-MNIST: a Realistic Image Dataset for Benchmarking Machine Learning Algorithms | Code | 1 |
| The VoicePrivacy 2020 Challenge Evaluation Plan | Code | 1 |
| Federated Learning Under Intermittent Client Availability and Time-Varying Communication Constraints | Code | 1 |
| Structured, flexible, and robust: benchmarking and improving large language models towards more human-like behavior in out-of-distribution reasoning tasks | Code | 1 |
| Clinical Prompt Learning with Frozen Language Models | Code | 1 |
| GenISP: Neural ISP for Low-Light Machine Cognition | Code | 1 |
| BiCo-Net: Regress Globally, Match Locally for Robust 6D Pose Estimation | Code | 1 |
| Benchmarking Econometric and Machine Learning Methodologies in Nowcasting | Code | 1 |
| Creating a Forensic Database of Shoeprints from Online Shoe Tread Photos | Code | 1 |
| Continual Learning with Foundation Models: An Empirical Study of Latent Replay | Code | 1 |
| A global analysis of metrics used for measuring performance in natural language processing | Code | 1 |
| NICO++: Towards Better Benchmarking for Domain Generalization | Code | 1 |
| Stress-Testing Point Cloud Registration on Automotive LiDAR | Code | 1 |
| Deep learning model solves change point detection for multiple change types | Code | 1 |
| Do You Really Mean That? Content Driven Audio-Visual Deepfake Dataset and Multimodal Method for Temporal Forgery Localization | Code | 1 |
| Data Splits and Metrics for Method Benchmarking on Surgical Action Triplet Datasets | Code | 1 |
| BioRED: A Rich Biomedical Relation Extraction Dataset | Code | 1 |
| The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems | Code | 1 |
| Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks | Code | 1 |
| Coarse-to-Fine Q-attention with Learned Path Ranking | Code | 1 |
| Earnings-22: A Practical Benchmark for Accents in the Wild | Code | 1 |
| Parameter-efficient Model Adaptation for Vision Transformers | Code | 1 |
| Visual Abductive Reasoning | Code | 1 |
| Fantastic Questions and Where to Find Them: FairytaleQA -- An Authentic Dataset for Narrative Comprehension | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |