Benchmarking

Papers

Showing 771–780 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
dMelodies: A Music Dataset for Disentanglement Learning	Jul 29, 2020	Benchmarking, Disentanglement	Code Available	1	5
Benchmarking the Spectrum of Agent Capabilities	Sep 14, 2021	Benchmarking	Code Available	1	5
Foundation Model of Electronic Medical Records for Adaptive Risk Estimation	Feb 10, 2025	Benchmarking	Code Available	1	5
Benchmarking TinyML Systems: Challenges and Direction	Mar 10, 2020	Benchmarking, Position	Code Available	1	5
Benchmarking Transcriptomics Foundation Models for Perturbation Analysis : one PCA still rules them all	Oct 17, 2024	All, Benchmarking	Code Available	1	5
fseval: A Benchmarking Framework for Feature Selection and Feature Ranking Algorithms	Nov 23, 2022	Automated Feature Engineering, Benchmarking	Code Available	1	5
Benchmarking tree species classification from proximally-sensed laser scanning data: introducing the FOR-species20K dataset	Aug 12, 2024	Benchmarking	Code Available	1	5
FullFront: Benchmarking MLLMs Across the Full Front-End Engineering Workflow	May 23, 2025	Benchmarking, Code Generation	Code Available	1	5
Don’t be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System	Nov 1, 2021	Benchmarking, Response Generation	Code Available	1	5
Formalizing Multimedia Recommendation through Multimodal Deep Learning	Sep 11, 2023	Benchmarking, Deep Learning	Code Available	1	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified