SOTAVerified

Benchmarking

Papers

Showing 1651–1660 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| CVC: A Large-Scale Chinese Value Rule Corpus for Value Alignment of Large Language Models | Code | 0 |
| Benchmarking Neural Speech Codec Intelligibility with SITool | | 0 |
| ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code | | 0 |
| ACCESS DENIED INC: The First Benchmark Environment for Sensitivity Awareness | Code | 0 |
| MedBookVQA: A Systematic and Comprehensive Medical Benchmark Derived from Open-Access Book | Code | 0 |
| ModuLM: Enabling Modular and Multimodal Molecular Relational Learning with Large Language Models | | 0 |
| The iNaturalist Sounds Dataset | | 0 |
| Benchmarking Foundation Models for Zero-Shot Biometric Tasks | | 0 |
| GenSpace: Benchmarking Spatially-Aware Image Generation | | 0 |
| Progressive Class-level Distillation | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |