Benchmarking

Papers

Showing 911–920 of 5548 papers

Title	Date	Tasks	Status	Hype
NLPBench: Evaluating Large Language Models on Solving NLP Problems	Sep 27, 2023	Benchmarking, Math	Code Available	1
OceanBench: The Sea Surface Height Edition	Sep 27, 2023	Benchmarking, Sensor Fusion	Code Available	1
Benchmarking Local Robustness of High-Accuracy Binary Neural Networks for Enhanced Traffic Sign Recognition	Sep 25, 2023	Autonomous Driving, Benchmarking	Code Available	1
Benchmarking Encoder-Decoder Architectures for Biplanar X-ray to 3D Shape Reconstruction	Sep 24, 2023	3D Shape Reconstruction, Anatomy	Code Available	1
Grad DFT: a software library for machine learning enhanced density functional theory	Sep 23, 2023	Benchmarking	Code Available	1
Prompt Tuned Embedding Classification for Multi-Label Industry Sector Allocation	Sep 21, 2023	Benchmarking, Classification	Code Available	1
An Image Dataset for Benchmarking Recommender Systems with Raw Pixels	Sep 13, 2023	Benchmarking, Recommendation Systems	Code Available	1
Formalizing Multimedia Recommendation through Multimodal Deep Learning	Sep 11, 2023	Benchmarking, Deep Learning	Code Available	1
FreeMan: Towards Benchmarking 3D Human Pose Estimation under Real-World Conditions	Sep 10, 2023	3D Human Pose Estimation, 3D Pose Estimation	Code Available	1
RecAD: Towards A Unified Library for Recommender Attack and Defense	Sep 9, 2023	Benchmarking, Recommendation Systems	Code Available	1


Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified