Benchmarking

Papers

Showing 1261–1270 of 5548 papers

Title	Date	Tasks	Status	Hype
PASS: An ImageNet replacement for self-supervised pretraining without humans	Sep 27, 2021	Benchmarking, Ethics	Code Available	1
Disentangled Feature Representation for Few-shot Image Classification	Sep 26, 2021	Benchmarking, Classification	Code Available	1
Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialogue System	Sep 23, 2021	Benchmarking, Response Generation	Code Available	1
SubseasonalClimateUSA: A Dataset for Subseasonal Forecasting and Benchmarking	Sep 21, 2021	Benchmarking	Code Available	1
Benchmarking the Combinatorial Generalizability of Complex Query Answering on Knowledge Graphs	Sep 18, 2021	Benchmarking, Complex Query Answering	Code Available	1
AI Accelerator Survey and Trends	Sep 18, 2021	Benchmarking, Computational Efficiency	Code Available	1
Benchmarking Commonsense Knowledge Base Population with an Effective Evaluation Dataset	Sep 16, 2021	Benchmarking, Knowledge Base Population	Code Available	1
OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication	Sep 16, 2021	3D Object Detection, Benchmarking	Code Available	1
Benchmarking the Spectrum of Agent Capabilities	Sep 14, 2021	Benchmarking	Code Available	1
RobustART: Benchmarking Robustness on Architecture Design and Training Techniques	Sep 11, 2021	Adversarial Robustness, Benchmarking	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified