SOTAVerified

Benchmarking

Papers

Showing 1826-1850 of 5548 papers

| Title | Status | Hype |
| --- | --- | --- |
| ChatGPT Alternative Solutions: Large Language Models Survey | | 0 |
| Comprehensive Energy Footprint Benchmarking Algorithm for Electrified Powertrains | | 0 |
| Comprehensive Energy Footprint Benchmarking of Strong Parallel Electrified Powertrain | | 0 |
| Comprehensive Review and Empirical Evaluation of Causal Discovery Algorithms for Numerical Data | | 0 |
| Computational and Exploratory Landscape Analysis of the GKLS Generator | | 0 |
| An Empirical Study of Automated Mislabel Detection in Real World Vision Datasets | | 0 |
| Chart-to-Experience: Benchmarking Multimodal LLMs for Predicting Experiential Impact of Charts | | 0 |
| Computer-aided diagnosis and prediction in brain disorders | | 0 |
| Computer Vision for Autonomous Vehicles: Problems, Datasets and State of the Art | | 0 |
| DRIV100: In-The-Wild Multi-Domain Dataset and Evaluation for Real-World Domain Adaptation of Semantic Segmentation | | 0 |
| ConDefects: A New Dataset to Address the Data Leakage Concern for LLM-based Fault Localization and Program Repair | | 0 |
| A War Beyond Deepfake: Benchmarking Facial Counterfeits and Countermeasures | | 0 |
| Conditionally Invariant Representation Learning for Disentangling Cellular Heterogeneity | | 0 |
| Conditional Neural Processes for Molecules | | 0 |
| Benchmarking Decoupled Neural Interfaces with Synthetic Gradients | | 0 |
| CoNES: Convex Natural Evolutionary Strategies | | 0 |
| Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization | | 0 |
| Configurable 3D Scene Synthesis and 2D Image Rendering with Per-Pixel Ground Truth using Stochastic Grammars | | 0 |
| Configurable Embodied Data Generation for Class-Agnostic RGB-D Video Segmentation | | 0 |
| Dual Task Framework for Improving Persona-grounded Dialogue Dataset | | 0 |
| Connecting convex energy-based inference and optimal transport for domain adaptation | | 0 |
| Dynamic benchmarking framework for LLM-based conversational data capture | | 0 |
| CHaRNet: Conditioned Heatmap Regression for Robust Dental Landmark Localization | | 0 |
| Benchmarking deep generative models for diverse antibody sequence design | | 0 |
| Characterizing Transactional Databases for Frequent Itemset Mining | | 0 |
Page 74 of 222

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |