SOTAVerified

Benchmarking

Papers

Showing 541-550 of 5548 papers

Title | Status | Hype
ZNO-Eval: Benchmarking reasoning capabilities of large language models in Ukrainian | Code | 1
Retrieval-Augmented Dialogue Knowledge Aggregation for Expressive Conversational Speech Synthesis | Code | 1
DiffuSETS: 12-lead ECG Generation Conditioned on Clinical Text Reports and Patient-Specific Information | Code | 1
VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models | Code | 1
Underwater Image Restoration Through a Prior Guided Hybrid Sense Approach and Extensive Benchmark Analysis | Code | 1
CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models | Code | 1
TrajLearn: Trajectory Prediction Learning using Deep Generative Models | Code | 1
SMAC-Hard: Enabling Mixed Opponent Strategy Script and Self-play on SMAC | Code | 1
On the Generalization Ability of Machine-Generated Text Detectors | Code | 1
Generative CKM Construction using Partially Observed Data with Diffusion Model | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified