SOTAVerified

Benchmarking

Papers

Showing 2201–2250 of 5548 papers

Title | Status | Hype
Few-Shot Defect Segmentation Leveraging Abundant Normal Training Samples Through Normal Background Regularization and Crop-and-Paste Operation | | 0
ForamViT-GAN: Exploring New Paradigms in Deep Learning for Micropaleontological Image Analysis | | 0
GANmut: Generating and Modifying Facial Expressions | | 0
h4rm3l: A language for Composable Jailbreak Attack Synthesis | | 0
BLAZE: Blazing Fast Privacy-Preserving Machine Learning | | 0
BLADE: Benchmark suite for LLM-driven Automated Design and Evolution of iterative optimisation heuristics | | 0
Black-Box Optimization Revisited: Improving Algorithm Selection Wizards through Massive Benchmarking | | 0
BENCHIP: Benchmarking Intelligence Processors | | 0
A Multi-Task Deep Learning Approach for Sensor-based Human Activity Recognition and Segmentation | | 0
Black-box Bayesian inference for economic agent-based models | | 0
BIQ2021: A Large-Scale Blind Image Quality Assessment Database | | 0
Adaptive Deep Kernel Learning | | 0
A Multisensory Learning Architecture for Rotation-invariant Object Recognition | | 0
FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation | | 0
FeDa4Fair: Client-Level Federated Datasets for Fairness Evaluation | | 0
FedAD-Bench: A Unified Benchmark for Federated Unsupervised Anomaly Detection in Tabular Data | | 0
BenchCouncil's View on Benchmarking AI and Other Emerging Workloads | | 0
Biomedical image analysis competitions: The state of current participation practice | | 0
Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method | | 0
Biological Valuation Map of Flanders: A Sentinel-2 Imagery Analysis | | 0
@Bench: Benchmarking Vision-Language Models for Human-centered Assistive Technology | | 0
Biologically Plausible Learning on Neuromorphic Hardware Architectures | | 0
Bio-Image Informatics Index BIII: A unique database of image analysis tools and workflows for and by the bioimaging community | | 0
A Multi-rater Comparative Study of Automatic Target Localization Methods for Epilepsy Deep Brain Stimulation Procedures | | 0
A Benchmark for Spray from Nearby Cutting Vehicles | | 0
BioDSA-1K: Benchmarking Data Science Agents for Biomedical Research | | 0
A Multimodal, Full-Surround Vehicular Testbed for Naturalistic Studies and Benchmarking: Design, Calibration and Deployment | | 0
Binary Classification with Positive Labeling Sources | | 0
Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs | | 0
Featuremetric benchmarking: Quantum computer benchmarks based on circuit features | | 0
A Multi-Labeled Dataset for Indonesian Discourse: Examining Toxicity, Polarization, and Demographics Information | | 0
BigDataBench: A Scalable and Unified Big Data and AI Benchmark Suite | | 0
BelHouse3D: A Benchmark Dataset for Assessing Occlusion Robustness in 3D Point Cloud Semantic Segmentation | | 0
Feature Encodings for Gradient Boosting with Automunge | | 0
Feature Selection and Classification of Hyperspectral Images With Support Vector Machines | | 0
Bi-Discriminator Class-Conditional Tabular GAN | | 0
Bi-DCSpell: A Bi-directional Detector-Corrector Interactive Framework for Chinese Spelling Check | | 0
Behavior Structformer: Learning Players Representations with Structured Tokenization | | 0
BEHAVIOR in Habitat 2.0: Simulator-Independent Logical Task Description for Benchmarking Embodied AI Agents | | 0
BIAS: Transparent reporting of biomedical image analysis challenges | | 0
AM-RADIO: Agglomerative Vision Foundation Model Reduce All Domains Into One | | 0
A Benchmark for Out of Distribution Detection in Point Cloud 3D Semantic Segmentation | | 0
Feature-based Evolutionary Diversity Optimization of Discriminating Instances for Chance-constrained Optimization Problems | | 0
Feature selection in linear SVMs via a hard cardinality constraint: a scalable SDP decomposition approach | | 0
Bias Mitigation for Machine Learning Classifiers: A Comprehensive Survey | | 0
Beyond Visual Understanding: Introducing PARROT-360V for Vision Language Model Benchmarking | | 0
Beyond Uniform Lipschitz Condition in Differentially Private Optimization | | 0
FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding | | 0
Beyond the Singular: The Essential Role of Multiple Generations in Effective Benchmark Evaluation and Analysis | | 0
Beyond the Hype: Benchmarking LLM-Evolved Heuristics for Bin Packing | | 0
Page 45 of 111

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified