SOTAVerified

Benchmarking

Papers

Showing 5151-5200 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| Dynamic Risk Assessment Methodology with an LDM-based System for Parking Scenarios | | 0 |
| A Synthetic Benchmarking Pipeline to Compare Camera Calibration Algorithms | | 0 |
| DynamicVL: Benchmarking Multimodal Large Language Models for Dynamic City Understanding | | 0 |
| Almost Equivariance via Lie Algebra Convolutions | | 0 |
| COCO: The Experimental Procedure | | 0 |
| Benchmarking In-the-wild Multimodal Disease Recognition and A Versatile Baseline | | 0 |
| VisionKG: Unleashing the Power of Visual Datasets via Knowledge Graph | | 0 |
| E2E Parking Dataset: An Open Benchmark for End-to-End Autonomous Parking | | 0 |
| A biologically-inspired multi-modal evaluation of molecular generative machine learning | | 0 |
| CMOS based image cytometry for detection of phytoplankton in ballast water | | 0 |
| EarthquakeNPP: Benchmark Datasets for Earthquake Forecasting with Neural Point Processes | | 0 |
| EASTER: Efficient and Scalable Text Recognizer | | 0 |
| CMAWRNet: Multiple Adverse Weather Removal via a Unified Quaternion Neural Architecture | | 0 |
| CloudifierNet -- Deep Vision Models for Artificial Image Processing | | 0 |
| Vision Learners Meet Web Image-Text Pairs | | 0 |
| CLLMate: A Multimodal Benchmark for Weather and Climate Events Forecasting | | 0 |
| ECG-Adv-GAN: Detecting ECG Adversarial Examples with Conditional Generative Adversarial Networks | | 0 |
| Synthetic Video Generation for Robust Hand Gesture Recognition in Augmented Reality Applications | | 0 |
| ECKGBench: Benchmarking Large Language Models in E-commerce Leveraging Knowledge Graph | | 0 |
| EconGym: A Scalable AI Testbed with Diverse Economic Tasks | | 0 |
| EconWebArena: Benchmarking Autonomous Agents on Economic Tasks in Realistic Web Environments | | 0 |
| CLIRudit: Cross-Lingual Information Retrieval of Scientific Documents | | 0 |
| clem:todd: A Framework for the Systematic Benchmarking of LLM-Based Task-Oriented Dialogue System Realisations | | 0 |
| Edge-Cloud Collaborative Computing on Distributed Intelligence and Model Optimization: A Survey | | 0 |
| Edge-First Language Model Inference: Models, Metrics, and Tradeoffs | | 0 |
| EdgeMark: An Automation and Benchmarking System for Embedded Artificial Intelligence Tools | | 0 |
| Synthetic weather radar using hybrid quantum-classical machine learning | | 0 |
| EditVal: Benchmarking Diffusion Based Text-Guided Image Editing Methods | | 0 |
| Classifying neuromorphic data using a deep learning framework for image classification | | 0 |
| EEGS: A Transparent Model of Emotions | | 0 |
| EffCNet: An Efficient CondenseNet for Image Classification on NXP BlueBox | | 0 |
| Scaling laws in global corporations as a benchmarking approach to assess environmental performance | | 0 |
| Effective Evaluation of Deep Active Learning on Image Classification Tasks | | 0 |
| Effective Transfer of Pretrained Large Visual Model for Fabric Defect Segmentation via Specifc Knowledge Injection | | 0 |
| Classification of the Fashion-MNIST Dataset on a Quantum Computer | | 0 |
| Efficacy of Synthetic Data as a Benchmark | | 0 |
| Efficiency in European Air Traffic Management -- A Fundamental Analysis of Data, Models, and Methods | | 0 |
| Efficient computation of backprojection arrays for 3D light field deconvolution | | 0 |
| Efficient and Accurate In-Database Machine Learning with SQL Code Generation in Python | | 0 |
| A Benchmarking Protocol for SAR Colorization: From Regression to Deep Learning Approaches | | 0 |
| SynthRAD2025 Grand Challenge dataset: generating synthetic CTs for radiotherapy | | 0 |
| Efficient Benchmarking of Algorithm Configuration Procedures via Model-Based Surrogates | | 0 |
| Efficient Benchmarking of Language Models | | 0 |
| Efficient Benchmarking of NLP APIs using Multi-armed Bandits | | 0 |
| Efficient but Vulnerable: Benchmarking and Defending LLM Batch Prompting Attack | | 0 |
| Efficient Channel Estimation for Millimeter Wave and Terahertz Systems Enabled by Integrated Super-resolution Sensing and Communication | | 0 |
| Classification and Retrieval of Digital Pathology Scans: A New Dataset | | 0 |
| Efficient Exploration of Image Classifier Failures with Bayesian Optimization and Text-to-Image Models | | 0 |
| Efficient Expression Neutrality Estimation with Application to Face Recognition Utility Prediction | | 0 |
| Efficiently Exploring Ordering Problems through Conflict-directed Search | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |