
Benchmarking

Papers


Showing 2321–2330 of 5548 papers

Title	Date	Tasks	Status	Hype
Benchmarking zero-shot stance detection with FlanT5-XXL: Insights from training data, prompting, and decoding strategies into its near-SoTA performance	Mar 1, 2024	Benchmarking, Stance Detection	Unverified	0
ALT: A Python Package for Lightweight Feature Representation in Time Series Classification	Apr 17, 2025	Benchmarking, Time Series	Unverified	0
FISBe: A Real-World Benchmark Dataset for Instance Segmentation of Long-Range Thin Filamentous Structures	Jan 1, 2024	Benchmarking, Instance Segmentation	Unverified	0
Benchmarking zero-shot and few-shot approaches for tokenization, tagging, and dependency parsing of Tagalog text	Aug 3, 2022	Benchmarking, Data Augmentation	Unverified	0
Benchmarking YOLOv8 for Optimal Crack Detection in Civil Infrastructure	Jan 12, 2025	Benchmarking, Hyperparameter Optimization	Unverified	0
AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs	Jun 5, 2025	Benchmarking, Video Understanding	Unverified	0
Benchmarking XAI Explanations with Human-Aligned Evaluations	Nov 4, 2024	Benchmarking	Unverified	0
A critical look at the current train/test split in machine learning	Jun 8, 2021	Active Learning, Benchmarking	Unverified	0
FinTMMBench: Benchmarking Temporal-Aware Multi-Modal RAG in Finance	Mar 7, 2025	Articles, Benchmarking	Unverified	0
FixCLR: Negative-Class Contrastive Learning for Semi-Supervised Domain Generalization	Jun 25, 2025	Benchmarking, Contrastive Learning	Unverified	0


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified