SOTAVerified

Benchmarking

Papers

Showing 2226–2250 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| BioDSA-1K: Benchmarking Data Science Agents for Biomedical Research | | 0 |
| A Multimodal, Full-Surround Vehicular Testbed for Naturalistic Studies and Benchmarking: Design, Calibration and Deployment | | 0 |
| Binary Classification with Positive Labeling Sources | | 0 |
| Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs | | 0 |
| Featuremetric benchmarking: Quantum computer benchmarks based on circuit features | | 0 |
| A Multi-Labeled Dataset for Indonesian Discourse: Examining Toxicity, Polarization, and Demographics Information | | 0 |
| BigDataBench: A Scalable and Unified Big Data and AI Benchmark Suite | | 0 |
| BelHouse3D: A Benchmark Dataset for Assessing Occlusion Robustness in 3D Point Cloud Semantic Segmentation | | 0 |
| Feature Encodings for Gradient Boosting with Automunge | | 0 |
| Feature Selection and Classification of Hyperspectral Images With Support Vector Machines | | 0 |
| Bi-Discriminator Class-Conditional Tabular GAN | | 0 |
| Bi-DCSpell: A Bi-directional Detector-Corrector Interactive Framework for Chinese Spelling Check | | 0 |
| Behavior Structformer: Learning Players Representations with Structured Tokenization | | 0 |
| BEHAVIOR in Habitat 2.0: Simulator-Independent Logical Task Description for Benchmarking Embodied AI Agents | | 0 |
| BIAS: Transparent reporting of biomedical image analysis challenges | | 0 |
| AM-RADIO: Agglomerative Vision Foundation Model Reduce All Domains Into One | | 0 |
| A Benchmark for Out of Distribution Detection in Point Cloud 3D Semantic Segmentation | | 0 |
| Feature-based Evolutionary Diversity Optimization of Discriminating Instances for Chance-constrained Optimization Problems | | 0 |
| Feature selection in linear SVMs via a hard cardinality constraint: a scalable SDP decomposition approach | | 0 |
| Bias Mitigation for Machine Learning Classifiers: A Comprehensive Survey | | 0 |
| Beyond Visual Understanding: Introducing PARROT-360V for Vision Language Model Benchmarking | | 0 |
| Beyond Uniform Lipschitz Condition in Differentially Private Optimization | | 0 |
| FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding | | 0 |
| Beyond the Singular: The Essential Role of Multiple Generations in Effective Benchmark Evaluation and Analysis | | 0 |
| Beyond the Hype: Benchmarking LLM-Evolved Heuristics for Bin Packing | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |