SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 2931–2940 of 5548 papers

Title	Date	Tasks	Status	Hype
MuSe-GNN: Learning Unified Gene Representation From Multimodal Biological Graph Data	Sep 29, 2023	BenchmarkingContrastive Learning	CodeCode Available	1
Sarcasm in Sight and Sound: Benchmarking and Expansion to Improve Multimodal Sarcasm Detection	Sep 29, 2023	BenchmarkingDiversity	—Unverified	0
FedAIoT: A Federated Learning Benchmark for Artificial Intelligence of Things	Sep 29, 2023	BenchmarkingFederated Learning	CodeCode Available	1
Optimizing with Low Budgets: a Comparison on the Black-box Optimization Benchmarking Suite and OpenAI Gym	Sep 29, 2023	Bayesian OptimizationBenchmarking	—Unverified	0
Benchmarking Collaborative Learning Methods Cost-Effectiveness for Prostate Segmentation	Sep 29, 2023	BenchmarkingFederated Learning	—Unverified	0
Benchmarking the Abilities of Large Language Models for RDF Knowledge Graph Creation and Comprehension: How Well Do LLMs Speak Turtle?	Sep 29, 2023	BenchmarkingKnowledge Graph Completion	CodeCode Available	1
Benchmarking Cognitive Biases in Large Language Models as Evaluators	Sep 29, 2023	BenchmarkingIn-Context Learning	CodeCode Available	1
Benchmarking and In-depth Performance Study of Large Language Models on Habana Gaudi Processors	Sep 29, 2023	BenchmarkingComputational Efficiency	—Unverified	0
A rigorous benchmarking of methods for SARS-CoV-2 lineage abundance estimation in wastewater	Sep 29, 2023	Benchmarking	—Unverified	0
Intuitive or Dependent? Investigating LLMs' Behavior Style to Conflicting Prompts	Sep 29, 2023	BenchmarkingDecision Making	—Unverified	0

Show:10 25 50

← PrevPage 294 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified