SOTAVerified

Benchmarking

Papers

Showing 2931-2940 of 5548 papers

Title | Status | Hype
MuSe-GNN: Learning Unified Gene Representation From Multimodal Biological Graph Data | Code | 1
Sarcasm in Sight and Sound: Benchmarking and Expansion to Improve Multimodal Sarcasm Detection |  | 0
FedAIoT: A Federated Learning Benchmark for Artificial Intelligence of Things | Code | 1
Optimizing with Low Budgets: a Comparison on the Black-box Optimization Benchmarking Suite and OpenAI Gym |  | 0
Benchmarking Collaborative Learning Methods Cost-Effectiveness for Prostate Segmentation |  | 0
Benchmarking the Abilities of Large Language Models for RDF Knowledge Graph Creation and Comprehension: How Well Do LLMs Speak Turtle? | Code | 1
Benchmarking Cognitive Biases in Large Language Models as Evaluators | Code | 1
Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors |  | 0
A rigorous benchmarking of methods for SARS-CoV-2 lineage abundance estimation in wastewater |  | 0
Intuitive or Dependent? Investigating LLMs' Behavior Style to Conflicting Prompts |  | 0
Page 294 of 555

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GPT-4 Turbo | ACC | 0.56 |  | Unverified