SOTAVerified

Benchmarking

Papers

Showing 2801-2810 of 5548 papers

Title | Status | Hype
GPTs and Language Barrier: A Cross-Lingual Legal QA Examination | | 0
Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models | | 0
Beyond Black-Box Benchmarking: Observability, Analytics, and Optimization of Agentic Systems | | 0
Variational Laplace for Bayesian neural networks | | 0
Granite-speech: open-source speech-aware LLMs with strong English ASR capabilities | | 0
Granular Change Accuracy: A More Accurate Performance Metric for Dialogue State Tracking | | 0
Graph Alignment for Benchmarking Graph Neural Networks and Learning Positional Encodings | | 0
Beyond Benchmarks: On The False Promise of AI Regulation | | 0
Graph Attention-based Decentralized Actor-Critic for Dual-Objective Control of Multi-UAV Swarms | | 0
Graph-based Deep-Tree Recursive Neural Network (DTRNN) for Text Classification | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified