SOTAVerified

Benchmarking

Papers

Showing 821-830 of 5548 papers

Title | Status | Hype
Generative Wind Power Curve Modeling Via Machine Vision: A Self-learning Deep Convolutional Network Based Method | Code | 1
GENEVA: Benchmarking Generalizability for Event Argument Extraction with Hundreds of Event Types and Argument Roles | Code | 1
Clinical Prompt Learning with Frozen Language Models | Code | 1
CO-Bench: Benchmarking Language Model Agents in Algorithm Search for Combinatorial Optimization | Code | 1
Depth-Driven Geometric Prompt Learning for Laparoscopic Liver Landmark Detection | Code | 1
CIBench: Evaluating Your LLMs with a Code Interpreter Plugin | Code | 1
German's Next Language Model | Code | 1
German Text Embedding Clustering Benchmark | Code | 1
4D Panoptic LiDAR Segmentation | Code | 1
CIDEr: Consensus-based Image Description Evaluation | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 | | Unverified