SOTAVerified

Benchmarking

Papers


Showing 3691–3700 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Motion-Grounded Video Reasoning: Understanding and Perceiving Motion at Pixel Level	Nov 15, 2024	Benchmarking, counterfactual	Unverified	0	0
A Dataset for Benchmarking Image-Based Localization	Jul 1, 2017	Benchmarking, Image-Based Localization	Unverified	0	0
Movie Description	May 12, 2016	Benchmarking	Unverified	0	0
MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning	Jun 4, 2023	Benchmarking, Contrastive Learning	Unverified	0	0
Moving Beyond Downstream Task Accuracy for Information Retrieval Benchmarking	Dec 2, 2022	Benchmarking, Information Retrieval	Unverified	0	0
MozzaVID: Mozzarella Volumetric Image Dataset	Dec 6, 2024	Benchmarking, Computed Tomography (CT)	Unverified	0	0
MPCLeague: Robust MPC Platform for Privacy-Preserving Machine Learning	Dec 26, 2021	Benchmarking, BIG-bench Machine Learning	Unverified	0	0
MRAnnotator: multi-Anatomy and many-Sequence MRI segmentation of 44 structures	Feb 1, 2024	Anatomy, Benchmarking	Unverified	0	0
MSAMSum: Towards Benchmarking Multi-lingual Dialogue Summarization	Nov 16, 2021	Benchmarking, dialogue summary	Unverified	0	0
Towards responsible AI for education: Hybrid human-AI to confront the Elephant in the room	Apr 22, 2025	Benchmarking, Fairness	Unverified	0	0


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified