SOTAVerified

Benchmarking

Papers

Showing 3626-3650 of 5548 papers

Title | Status | Hype
TRAM: Benchmarking Temporal Reasoning for Large Language Models | | 0
Adaptive Visual Scene Understanding: Incremental Scene Graph Generation | Code | 0
The Sparsity Roofline: Understanding the Hardware Limits of Sparse Neural Networks | | 0
Adaptive Control of an Inverted Pendulum by a Reinforcement Learning-based LQR Method | | 0
Benchmarking Collaborative Learning Methods Cost-Effectiveness for Prostate Segmentation | | 0
A rigorous benchmarking of methods for SARS-CoV-2 lineage abundance estimation in wastewater | | 0
Intuitive or Dependent? Investigating LLMs' Behavior Style to Conflicting Prompts | | 0
Sarcasm in Sight and Sound: Benchmarking and Expansion to Improve Multimodal Sarcasm Detection | | 0
Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors | | 0
Optimizing with Low Budgets: a Comparison on the Black-box Optimization Benchmarking Suite and OpenAI Gym | | 0
Language Models as a Service: Overview of a New Paradigm and its Challenges | | 0
Demographic Parity: Mitigating Biases in Real-World Data | | 0
On quantifying and improving realism of images generated with diffusion | | 0
Advancing The Rate-Distortion-Computation Frontier For Neural Image Compression | | 0
Thalamic nuclei segmentation from T_1-weighted MRI: unifying and benchmarking state-of-the-art methods with young and old cohorts | | 0
Optimization Techniques for a Physical Model of Human Vocalisation | | 0
Efficient Pauli channel estimation with logarithmic quantum memory | | 0
VisionKG: Unleashing the Power of Visual Datasets via Knowledge Graph | | 0
Categorization and analysis of 14 computational methods for estimating cell potency from single-cell RNA-seq data | | 0
Machine-assisted quantitizing designs: augmenting humanities and social sciences with artificial intelligence | Code | 0
Turbulence in Focus: Benchmarking Scaling Behavior of 3D Volumetric Super-Resolution with BLASTNet 2.0 Data | | 0
Domain Adaptation for Arabic Machine Translation: The Case of Financial Texts | | 0
Multimodal Deep Learning for Scientific Imaging Interpretation | | 0
Benchmarking quantized LLaMa-based models on the Brazilian Secondary School Exam | | 0
On the relationship between Benchmarking, Standards and Certification in Robotics and AI | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified