Benchmarking

Papers

Showing 2731–2740 of 5548 papers

Title	Date	Tasks	Status	Hype
ASI: Accuracy-Stability Index for Evaluating Deep Learning Models	Nov 26, 2023	Benchmarking, Deep Learning	Unverified	0
An Empirical Investigation into Benchmarking Model Multiplicity for Trustworthy Machine Learning: A Case Study on Image Classification	Nov 24, 2023	Benchmarking, image-classification	Unverified	0
Benchmarking Robustness of Text-Image Composed Retrieval	Nov 24, 2023	Attribute, Benchmarking	Code Available	1
Large Language Models as Automated Aligners for benchmarking Vision-Language Models	Nov 24, 2023	Benchmarking, World Knowledge	Unverified	0
Dialogue Quality and Emotion Annotations for Customer Support Conversations	Nov 23, 2023	Benchmarking, Diversity	Code Available	0
Learning Dynamic Selection and Pricing of Out-of-Home Deliveries	Nov 23, 2023	Benchmarking, Decision Making	Code Available	0
Creating and Leveraging a Synthetic Dataset of Cloud Optical Thickness Measures for Cloud Detection in MSI	Nov 23, 2023	Benchmarking, Cloud Detection	Code Available	0
Automated 3D Tumor Segmentation using Temporal Cubic PatchGAN (TCuP-GAN)	Nov 23, 2023	Benchmarking, Brain Tumor Segmentation	Unverified	0
PG-Video-LLaVA: Pixel Grounding Large Video-Language Models	Nov 22, 2023	Benchmarking, Phrase Grounding	Code Available	2
Benchmarking Toxic Molecule Classification using Graph Neural Networks and Few Shot Learning	Nov 22, 2023	Benchmarking, Drug Discovery	Unverified	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified