SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1571–1580 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
An Optical Control Environment for Benchmarking Reinforcement Learning Algorithms	Mar 23, 2022	BenchmarkingDeep Reinforcement Learning	CodeCode Available	0	5
Keep Security! Benchmarking Security Policy Preservation in Large Language Model Contexts Against Indirect Attacks in Question Answering	May 21, 2025	BenchmarkingLanguage Modeling	CodeCode Available	0	5
KamNet: An Integrated Spatiotemporal Deep Neural Network for Rare Event Search in KamLAND-Zen	Mar 3, 2022	Benchmarking	CodeCode Available	0	5
KArSL: Arabic Sign Language Database	Jan 1, 2021	BenchmarkingSign Language Recognition	CodeCode Available	0	5
Benchmarking Deep Learning Models on NVIDIA Jetson Nano for Real-Time Systems: An Empirical Investigation	Jun 25, 2024	Action DetectionBenchmarking	CodeCode Available	0	5
An open unified deep graph learning framework for discovering drug leads	Dec 6, 2022	BenchmarkingDrug Discovery	CodeCode Available	0	5
Advancing and Benchmarking Personalized Tool Invocation for LLMs	May 7, 2025	BenchmarkingWorld Knowledge	CodeCode Available	0	5
Joint Multi-Scale Tone Mapping and Denoising for HDR Image Enhancement	Mar 16, 2023	BenchmarkingDemosaicking	CodeCode Available	0	5
Benchmarking Robustness of Deep Learning Classifiers Using Two-Factor Perturbation	Mar 2, 2021	BenchmarkingDeep Learning	CodeCode Available	0	5
Can LLMs replace Neil deGrasse Tyson? Evaluating the Reliability of LLMs as Science Communicators	Sep 21, 2024	Benchmarking	CodeCode Available	0	5

Show:10 25 50

← PrevPage 158 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified