Benchmarking

Papers

Showing 3901–3910 of 5548 papers

Title	Date	Tasks	Status	Hype
Improving Items and Contexts Understanding with Descriptive Graph for Conversational Recommendation	Apr 11, 2023	Benchmarking, Conversational Recommendation	Unverified	0
Benchmarking the Physical-world Adversarial Robustness of Vehicle Detection	Apr 11, 2023	Adversarial Attack, Adversarial Robustness	Unverified	0
Certifiable Black-Box Attacks with Randomized Adversarial Examples: Breaking Defenses with Provable Confidence	Apr 10, 2023	Benchmarking, speech-recognition	Code Available	0
ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit	Apr 10, 2023	Benchmarking, Simultaneous Speech-to-Text Translation	Unverified	0
On Evaluation of Bangla Word Analogies	Apr 10, 2023	Benchmarking, Word Embeddings	Unverified	0
ForamViT-GAN: Exploring New Paradigms in Deep Learning for Micropaleontological Image Analysis	Apr 9, 2023	Benchmarking, Deep Learning	Unverified	0
SimbaML: Connecting Mechanistic Models and Machine Learning with Augmented Data	Apr 8, 2023	Benchmarking, Data Augmentation	Code Available	0
Benchmarking the Robustness of Quantized Models	Apr 8, 2023	Benchmarking, Quantization	Unverified	0
Probing Conceptual Understanding of Large Visual-Language Models	Apr 7, 2023	Benchmarking	Code Available	0
Benchmarking Robustness to Text-Guided Corruptions	Apr 6, 2023	Benchmarking, Data Augmentation	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified