Benchmarking

Papers

Showing 1881–1890 of 5548 papers

Title	Date	Tasks	Status	Hype
Are Synthetic Corruptions A Reliable Proxy For Real-World Corruptions?	May 7, 2025	Benchmarking, Semantic Segmentation	Code Available	0
Towards Efficient Benchmarking of Foundation Models in Remote Sensing: A Capabilities Encoding Approach	May 6, 2025	Benchmarking, Earth Observation	Code Available	0
MedArabiQ: Benchmarking Large Language Models on Arabic Medical Tasks	May 6, 2025	Benchmarking, Multiple-choice	Code Available	0
Call for Action: towards the next generation of symbolic regression benchmark	May 6, 2025	Benchmarking, Diversity	Unverified	0
Multimodal Benchmarking and Recommendation of Text-to-Image Generation Models	May 6, 2025	Benchmarking, Image Generation	Code Available	0
NeuroSim V1.5: Improved Software Backbone for Benchmarking Compute-in-Memory Accelerators with Device and Circuit-level Non-idealities	May 5, 2025	Benchmarking, Quantization	Code Available	0
Physics-Learning AI Datamodel (PLAID) datasets: a collection of physics simulations for machine learning	May 5, 2025	Benchmarking	Unverified	0
Completing Spatial Transcriptomics Data for Gene Expression Prediction Benchmarking	May 5, 2025	Benchmarking, Prediction	Unverified	0
Benchmarking Feature Upsampling Methods for Vision Foundation Models using Interactive Segmentation	May 4, 2025	Benchmarking, Feature Upsampling	Code Available	0
Representation Learning of Limit Order Book: A Comprehensive Study and Benchmarking	May 4, 2025	Benchmarking, Representation Learning	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified