SOTAVerified

Benchmarking

Papers

Showing 411-420 of 5548 papers

| Title | Status | Hype |
|---|---|---|
| Relation Extraction Across Entire Books to Reconstruct Community Networks: The AffilKG Datasets | | 0 |
| TCC-Bench: Benchmarking the Traditional Chinese Culture Understanding Capabilities of MLLMs | Code | 0 |
| M4-SAR: A Multi-Resolution, Multi-Polarization, Multi-Scene, Multi-Source Dataset and Benchmark for Optical-SAR Fusion Object Detection | Code | 1 |
| Benchmarking performance, explainability, and evaluation strategies of vision-language models for surgery: Challenges and opportunities | | 0 |
| Words That Unite The World: A Unified Framework for Deciphering Central Bank Communications Globally | Code | 1 |
| Model Performance-Guided Evaluation Data Selection for Effective Prompt Optimization | | 0 |
| GNN-Suite: a Graph Neural Network Benchmarking Framework for Biomedical Informatics | Code | 0 |
| On the Evaluation of Engineering Artificial General Intelligence | | 0 |
| Real-World fNIRS-Based Brain-Computer Interfaces: Benchmarking Deep Learning and Classical Models in Interactive Gaming | | 0 |
| DIF: A Framework for Benchmarking and Verifying Implicit Bias in LLMs | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | GPT-4 Turbo | ACC | 0.56 | | Unverified |