SOTAVerified

Benchmarking

Papers

Showing 4941–4950 of 5548 papers

Title	Date	Tasks	Status	Hype
ExEBench: Benchmarking Foundation Models on Extreme Earth Events	May 13, 2025	Benchmarking, Management	Code Available	0
MULTITAT: Benchmarking Multilingual Table-and-Text Question Answering	Feb 24, 2025	Benchmarking, Question Answering	Code Available	0
Evolving Evolutionary Algorithms with Patterns	Oct 10, 2021	Benchmarking, Evolutionary Algorithms	Code Available	0
Semantic Hilbert Space for Text Representation Learning	Feb 26, 2019	Benchmarking, General Classification	Code Available	0
A Continuous Information Gain Measure to Find the Most Discriminatory Problems for AI Benchmarking	Sep 9, 2018	Benchmarking, Game Design	Code Available	0
Timage -- A Robust Time Series Classification Pipeline	Sep 19, 2019	Benchmarking, Classification	Code Available	0
AttackNet: Enhancing Biometric Security via Tailored Convolutional Neural Network Architectures for Liveness Detection	Feb 6, 2024	Benchmarking	Code Available	0
EvoLearner: Learning Description Logics with Evolutionary Algorithms	Nov 8, 2021	Benchmarking, Evolutionary Algorithms	Code Available	0
Evidential Deep Learning for Uncertainty Quantification and Out-of-Distribution Detection in Jet Identification using Deep Neural Networks	Jan 10, 2025	Anomaly Detection, Benchmarking	Code Available	0
Integrating Large Language Models and Knowledge Graphs for Extraction and Validation of Textual Test Data	Aug 3, 2024	Benchmarking, Knowledge Graphs	Code Available	0

Page 495 of 555

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified