
Benchmarking

Papers


Showing 4651–4660 of 5548 papers

Title	Date	Tasks	Status	Hype
Mamba-Based Ensemble learning for White Blood Cell Classification	Apr 15, 2025	Benchmarking, Classification	Code Available	0
Better Late Than Never: Formulating and Benchmarking Recommendation Editing	Jun 6, 2024	Benchmarking, Recommendation Systems	Code Available	0
Better force fields start with better data -- A data set of cation dipeptide interactions	Jul 19, 2021	Benchmarking	Code Available	0
MANTRA: The Manifold Triangulations Assemblage	Oct 3, 2024	Benchmarking	Code Available	0
BeSt-LeS: Benchmarking Stroke Lesion Segmentation using Deep Supervision	Oct 10, 2023	Acute Stroke Lesion Segmentation, Benchmarking	Code Available	0
debiaSAE: Benchmarking and Mitigating Vision-Language Model Bias	Oct 17, 2024	Benchmarking, Bias Detection	Code Available	0
VizSeq: A Visual Analysis Toolkit for Text Generation Tasks	Sep 12, 2019	Benchmarking, Image Captioning	Code Available	0
PATH: A Discrete-sequence Dataset for Evaluating Online Unsupervised Anomaly Detection Approaches for Multivariate Time Series	Nov 21, 2024	Anomaly Detection, Benchmarking	Code Available	0
Hi Guys or Hi Folks? Benchmarking Gender-Neutral Machine Translation with the GeNTE Corpus	Oct 8, 2023	Benchmarking, Machine Translation	Code Available	0
Margin-bounded Confidence Scores for Out-of-Distribution Detection	Sep 22, 2024	Autonomous Driving, Benchmarking	Code Available	0



Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified