SOTAVerified

Benchmarking

Papers

Showing 2601-2610 of 5548 papers

Title | Status | Hype
Image2Struct: Benchmarking Structure Extraction for Vision-Language Models | - | 0
SS3DM: Benchmarking Street-View Surface Reconstruction with a Synthetic 3D Mesh Dataset | - | 0
AI Cyber Risk Benchmark: Automated Exploitation Capabilities | - | 0
Benchmarking LLM Guardrails in Handling Multilingual Toxicity | - | 0
Benchmarking Human and Automated Prompting in the Segment Anything Model | Code | 0
CURATe: Benchmarking Personalised Alignment of Conversational AI Assistants | Code | 0
Project MPG: towards a generalized performance benchmark for LLM capabilities | - | 0
CODES: Benchmarking Coupled ODE Surrogates | Code | 0
Rephrasing natural text data with different languages and quality levels for Large Language Model pre-training | - | 0
BongLLaMA: LLaMA for Bangla Language | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | - | Unverified