Benchmarking

Papers

Showing 951–960 of 5548 papers

Title	Date	Tasks	Status	Hype
Machine learning for modelling unstructured grid data in computational physics: a review	Feb 13, 2025	Benchmarking	Unverified	0
LOB-Bench: Benchmarking Generative AI for Finance -- an Application to Limit Order Book Data	Feb 13, 2025	Benchmarking, State Space Models	Code Available	1
SkyRover: A Modular Simulator for Cross-Domain Pathfinding	Feb 13, 2025	Benchmarking	Unverified	0
Handwritten Text Recognition: A Survey	Feb 12, 2025	Benchmarking, Handwritten Text Recognition	Unverified	0
Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance	Feb 12, 2025	Benchmarking, Long-Context Understanding	Code Available	2
One-Shot Federated Learning with Classifier-Free Diffusion Models	Feb 12, 2025	Benchmarking, Dataset Generation	Unverified	0
Causal Analysis of ASR Errors for Children: Quantifying the Impact of Physiological, Cognitive, and Extrinsic Factors	Feb 12, 2025	Automatic Speech Recognition, Automatic Speech Recognition (ASR)	Unverified	0
The Devil is in the Prompts: De-Identification Traces Enhance Memorization Risks in Synthetic Chest X-Ray Generation	Feb 11, 2025	Benchmarking, De-identification	Code Available	0
exHarmony: Authorship and Citations for Benchmarking the Reviewer Assignment Problem	Feb 11, 2025	Benchmarking, Diversity	Code Available	0
Foundation Model of Electronic Medical Records for Adaptive Risk Estimation	Feb 10, 2025	Benchmarking	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified