SOTAVerified

Benchmarking

Papers

Showing 2951–2975 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence | | 0 |
| Dynamic-KGQA: A Scalable Framework for Generating Adaptive Question Answering Datasets | | 0 |
| Dynamic Obstacle Avoidance with Bounded Rationality Adversarial Reinforcement Learning | | 0 |
| Dynamic PDB: A New Dataset and a SE(3) Model Extension by Integrating Dynamic Behaviors and Physical Properties in Protein Structures | | 0 |
| Dynamic Risk Assessment Methodology with an LDM-based System for Parking Scenarios | | 0 |
| DynamicVL: Benchmarking Multimodal Large Language Models for Dynamic City Understanding | | 0 |
| E2E Parking Dataset: An Open Benchmark for End-to-End Autonomous Parking | | 0 |
| EarthquakeNPP: Benchmark Datasets for Earthquake Forecasting with Neural Point Processes | | 0 |
| EASTER: Efficient and Scalable Text Recognizer | | 0 |
| ECG-Adv-GAN: Detecting ECG Adversarial Examples with Conditional Generative Adversarial Networks | | 0 |
| ECKGBench: Benchmarking Large Language Models in E-commerce Leveraging Knowledge Graph | | 0 |
| EconGym: A Scalable AI Testbed with Diverse Economic Tasks | | 0 |
| EconWebArena: Benchmarking Autonomous Agents on Economic Tasks in Realistic Web Environments | | 0 |
| Edge-Cloud Collaborative Computing on Distributed Intelligence and Model Optimization: A Survey | | 0 |
| Edge-First Language Model Inference: Models, Metrics, and Tradeoffs | | 0 |
| EdgeMark: An Automation and Benchmarking System for Embedded Artificial Intelligence Tools | | 0 |
| EditVal: Benchmarking Diffusion Based Text-Guided Image Editing Methods | | 0 |
| EEGS: A Transparent Model of Emotions | | 0 |
| EffCNet: An Efficient CondenseNet for Image Classification on NXP BlueBox | | 0 |
| Effective Evaluation of Deep Active Learning on Image Classification Tasks | | 0 |
| Effective Transfer of Pretrained Large Visual Model for Fabric Defect Segmentation via Specifc Knowledge Injection | | 0 |
| Efficacy of Synthetic Data as a Benchmark | | 0 |
| Efficiency in European Air Traffic Management -- A Fundamental Analysis of Data, Models, and Methods | | 0 |
| Efficient computation of backprojection arrays for 3D light field deconvolution | | 0 |
| Efficient and Accurate In-Database Machine Learning with SQL Code Generation in Python | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |