Benchmarking

Papers

Showing 2391–2400 of 5548 papers

Title	Date	Tasks	Status	Hype
Leveraging Spatial and Semantic Feature Extraction for Skin Cancer Diagnosis with Capsule Networks and Graph Neural Networks	Mar 18, 2024	Benchmarking, Classification	Unverified	0
Benchmarking the Robustness of UAV Tracking Against Common Corruptions	Mar 18, 2024	Benchmarking	Code Available	0
A Sober Look at the Robustness of CLIPs to Spurious Features	Mar 18, 2024	Benchmarking	Unverified	0
FlowMind: Automatic Workflow Generation with LLMs	Mar 17, 2024	Benchmarking, Question Answering	Unverified	0
Granular Change Accuracy: A More Accurate Performance Metric for Dialogue State Tracking	Mar 17, 2024	Benchmarking, Dialogue State Tracking	Unverified	0
Depression Detection on Social Media with Large Language Models	Mar 16, 2024	Benchmarking, Depression Detection	Unverified	0
An Improved Metric and Benchmark for Assessing the Performance of Virtual Screening Models	Mar 15, 2024	Benchmarking, Drug Discovery	Code Available	1
Benchmarking Adversarial Robustness of Image Shadow Removal with Shadow-adaptive Attacks	Mar 15, 2024	Adversarial Attack, Adversarial Robustness	Unverified	0
Histo-Genomic Knowledge Distillation For Cancer Prognosis From Histopathology Whole Slide Images	Mar 15, 2024	Benchmarking, Knowledge Distillation	Code Available	1
Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study	Mar 15, 2024	Benchmarking	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified