SOTAVerified

Benchmarking

Papers

Showing 2801-2825 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| DailyQA: A Benchmark to Evaluate Web Retrieval Augmented LLMs Based on Capturing Real-World Changes | | 0 |
| Danish Airs and Grounds: A Dataset for Aerial-to-Street-Level Place Recognition and Localization | | 0 |
| DarkBench: Benchmarking Dark Patterns in Large Language Models | | 0 |
| DASB -- Discrete Audio and Speech Benchmark | | 0 |
| Data Analysis in the Era of Generative AI | | 0 |
| Data and its (dis)contents: A survey of dataset development and use in machine learning research | | 0 |
| Data Augmentation for Continual RL via Adversarial Gradient Episodic Memory | | 0 |
| Data Augmentation for Traffic Classification | | 0 |
| Data Collection of Real-Life Knowledge Work in Context: The RLKWiC Dataset | | 0 |
| Data-driven Approach for Static Hedging of Exchange Traded Options | | 0 |
| Data-driven inventory management for new products: An adjusted Dyna-Q approach with transfer learning | | 0 |
| Data-driven Power Flow Linearization: Simulation | | 0 |
| Data-driven surrogate modelling and benchmarking for process equipment | | 0 |
| Data-Driven Target Localization: Benchmarking Gradient Descent Using the Cramer-Rao Bound | | 0 |
| Data needs and challenges for quantum dot devices automation | | 0 |
| Multi-scale data reconstruction of turbulent rotating flows with Gappy POD, Extended POD and Generative Adversarial Networks | | 0 |
| Dataset and Benchmarking of Real-Time Embedded Object Detection for RoboCup SSL | | 0 |
| DB3V: A Dialect Dominated Dataset of Bird Vocalisation for Cross-corpus Bird Species Recognition | | 0 |
| DBsurf: A Discrepancy Based Method for Discrete Stochastic Gradient Estimation | | 0 |
| DDR-ID: Dual Deep Reconstruction Networks Based Image Decomposition for Anomaly Detection | | 0 |
| DeAR: Debiasing Vision-Language Models with Additive Residuals | | 0 |
| DECASTE: Unveiling Caste Stereotypes in Large Language Models through Multi-Dimensional Bias Analysis | | 0 |
| Decentralized Federated Learning on the Edge over Wireless Mesh Networks | | 0 |
| Decentralized Joint Beamforming, User Scheduling and QoS Management in 5G and Beyond Systems | | 0 |
| Decentralized Learning for Overparameterized Problems: A Multi-Agent Kernel Approximation Approach | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |