SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1731–1740 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
KhabarChin: Automatic Detection of Important News in the Persian Language	Dec 6, 2023	ArticlesBenchmarking	CodeCode Available	0	5
Can LLMs Grasp Implicit Cultural Values? Benchmarking LLMs' Metacognitive Cultural Intelligence with CQ-Bench	Apr 1, 2025	Benchmarking	CodeCode Available	0	5
InViG: Benchmarking Interactive Visual Grounding with 500K Human-Robot Interactions	Oct 18, 2023	BenchmarkingVisual Grounding	CodeCode Available	0	5
Introducing SLAMBench, a performance and accuracy benchmarking methodology for SLAM	Oct 8, 2014	Benchmarking	CodeCode Available	0	5
Inverse Contextual Bandits: Learning How Behavior Evolves over Time	Jul 13, 2021	BenchmarkingDecision Making	CodeCode Available	0	5
Investigating the Impact of Hard Samples on Accuracy Reveals In-class Data Imbalance	Sep 22, 2024	AutoMLBenchmarking	CodeCode Available	0	5
INTERSPEECH 2009 Emotion Challenge Revisited: Benchmarking 15 Years of Progress in Speech Emotion Recognition	Jun 10, 2024	BenchmarkingEmotion Recognition	CodeCode Available	0	5
CLEAVE: Scalable and Edge-native Benchmarking of Networked Control Systems	Apr 5, 2022	BenchmarkingEdge-computing	CodeCode Available	0	5
Can geometric combinatorics improve RNA branching predictions?	Mar 26, 2025	Benchmarking	CodeCode Available	0	5
Air Learning: A Deep Reinforcement Learning Gym for Autonomous Aerial Robot Visual Navigation	Jun 2, 2019	BenchmarkingDeep Reinforcement Learning	CodeCode Available	0	5

Show:10 25 50

← PrevPage 174 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified