SOTAVerified

Benchmarking

Papers

Showing 1376-1400 of 5548 papers

Title | Status | Hype
Introducing Milabench: Benchmarking Accelerators for AI | Code | 1
Benchpress: A Scalable and Versatile Workflow for Benchmarking Structure Learning Algorithms | Code | 1
BEND: Benchmarking DNA Language Models on biologically meaningful tasks | Code | 1
Introducing the VoicePrivacy Initiative | Code | 1
BenchML: an extensible pipelining framework for benchmarking representations of materials and molecules at scale | Code | 1
Benchmarking the Robustness of Deep Neural Networks to Common Corruptions in Digital Pathology | Code | 1
Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM | Code | 1
Benchmark on Drug Target Interaction Modeling from a Structure Perspective | Code | 1
Benchmarks for Deep Off-Policy Evaluation | Code | 1
Intrinsic Image Harmonization | Code | 1
Exploiting News Article Structure for Automatic Corpus Generation of Entailment Datasets | Code | 1
Align and Distill: Unifying and Improving Domain Adaptive Object Detection | Code | 1
Event-Free Moving Object Segmentation from Moving Ego Vehicle | Code | 1
Ducho 2.0: Towards a More Up-to-Date Unified Framework for the Extraction of Multimodal Features in Recommendation | Code | 1
Benchmarking the Robustness of Spatial-Temporal Models Against Corruptions | Code | 1
Benchmarking Image Retrieval for Visual Localization | Code | 1
ArabicaQA: A Comprehensive Dataset for Arabic Question Answering | Code | 1
Benchmarking human visual search computational models in natural scenes: models comparison and reference datasets | Code | 1
Interpretable statistical representations of neural population dynamics and geometry | Code | 1
InstructTTSEval: Benchmarking Complex Natural-Language Instruction Following in Text-to-Speech Systems | Code | 1
Dynatask: A Framework for Creating Dynamic AI Benchmark Tasks | Code | 1
Physiology-based simulation of the retinal vasculature enables annotation-free segmentation of OCT angiographs | Code | 1
PIC4rl-gym: a ROS2 modular framework for Robots Autonomous Navigation with Deep Reinforcement Learning | Code | 1
Aquatic Navigation: A Challenging Benchmark for Deep Reinforcement Learning | Code | 1
IntelliGraphs: Datasets for Benchmarking Knowledge Graph Generation | Code | 1
Page 56 of 222

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified