SOTAVerified

Benchmarking

Papers

Showing 4971–4980 of 5548 papers

Title	Date	Tasks	Status	Hype
Evaluating SAT and SMT Solvers on Large-Scale Sudoku Puzzles	Jan 15, 2025	Benchmarking	Code Available	0
NbBench: Benchmarking Language Models for Comprehensive Nanobody Tasks	May 4, 2025	Benchmarking, Representation Learning	Code Available	0
NCAdapt: Dynamic adaptation with domain-specific Neural Cellular Automata for continual hippocampus segmentation	Oct 30, 2024	Benchmarking, Continual Learning	Code Available	0
A Systematic Review of Green AI	Jan 26, 2023	Benchmarking	Code Available	0
Evaluating LLP Methods: Challenges and Approaches	Oct 29, 2023	Benchmarking, Model Selection	Code Available	0
Evaluating Feature Attribution Methods in the Image Domain	Feb 22, 2022	Benchmarking	Code Available	0
NegBio: a high-performance tool for negation and uncertainty detection in radiology reports	Dec 16, 2017	Benchmarking, Negation	Code Available	0
A Comprehensive Comparison of Multi-Dimensional Image Denoising Methods	Nov 6, 2020	Benchmarking, Denoising	Code Available	0
NeMig -- A Bilingual News Collection and Knowledge Graph about Migration	Sep 1, 2023	Articles, Benchmarking	Code Available	0
NengoDL: Combining deep learning and neuromorphic modelling methods	May 28, 2018	Benchmarking, Deep Learning	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified