SOTAVerified

Benchmarking

Papers

Showing 4126–4150 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Person Search by Multi-Scale Matching | | 0 |
| Person Search by Multi-Scale Matching | | 0 |
| Perspective on recent developments and challenges in regulatory and systems genomics | | 0 |
| Perspectives on the State and Future of Deep Learning -- 2023 | | 0 |
| Perturbation-based exploration methods in deep reinforcement learning | | 0 |
| Benchmark Analysis of Various Pre-trained Deep Learning Models on ASSIRA Cats and Dogs Dataset | | 0 |
| BENCHIP: Benchmarking Intelligence Processors | | 0 |
| PGLearn -- An Open-Source Learning Toolkit for Optimal Power Flow | | 0 |
| PGLib-CO2: A Power Grid Library for Computing and Optimizing Carbon Emissions | | 0 |
| BenchCouncil's View on Benchmarking AI and Other Emerging Workloads | | 0 |
| PhD Thesis on Code Modulated Interferometric Imaging System using Phased Arrays | | 0 |
| Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle | | 0 |
| PhilHumans: Benchmarking Machine Learning for Personal Health | | 0 |
| @Bench: Benchmarking Vision-Language Models for Human-centered Assistive Technology | | 0 |
| PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding | | 0 |
| PhySense: Principle-Based Physics Reasoning Benchmarking for Large Language Models | | 0 |
| Physics-Learning AI Datamodel (PLAID) datasets: a collection of physics simulations for machine learning | | 0 |
| Benanza: Automatic μBenchmark Generation to Compute "Lower-bound" Latency and Inform Optimizations of Deep Learning Models on GPUs | | 0 |
| PhytoSynth: Leveraging Multi-modal Generative Models for Crop Disease Data Generation with Novel Benchmarking and Prompt Engineering Approach | | 0 |
| BelHouse3D: A Benchmark Dataset for Assessing Occlusion Robustness in 3D Point Cloud Semantic Segmentation | | 0 |
| Behavior Structformer: Learning Players Representations with Structured Tokenization | | 0 |
| Yesil o1 Pro: Evidence-Based AI Model for Health and Benchmarking in Clinical Decision Support | | 0 |
| PieTrack: An MOT solution based on synthetic data training and self-supervised domain adaptation | | 0 |
| BEHAVIOR in Habitat 2.0: Simulator-Independent Logical Task Description for Benchmarking Embodied AI Agents | | 0 |
| Turbulence in Focus: Benchmarking Scaling Behavior of 3D Volumetric Super-Resolution with BLASTNet 2.0 Data | | 0 |
Page 166 of 222

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |