Benchmarking

Papers

Showing 2551–2560 of 5548 papers

Title	Date	Tasks	Status	Hype
The ParClusterers Benchmark Suite (PCBS): A Fine-Grained Analysis of Scalable Graph Clustering	Nov 15, 2024	Benchmarking, Clustering	Unverified	0
WelQrate: Defining the Gold Standard in Small Molecule Drug Discovery Benchmarking	Nov 14, 2024	Benchmarking, Drug Discovery	Unverified	0
BEARD: Benchmarking the Adversarial Robustness for Dataset Distillation	Nov 14, 2024	Adversarial Attack, Adversarial Robustness	Code Available	0
A survey of probabilistic generative frameworks for molecular simulations	Nov 14, 2024	Benchmarking, Denoising	Code Available	0
HyperFace: Generating Synthetic Face Recognition Datasets by Exploring Face Embedding Hypersphere	Nov 13, 2024	Benchmarking, Dataset Generation	Unverified	0
Anomaly Detection in Large-Scale Cloud Systems: An Industry Case and Dataset	Nov 13, 2024	Anomaly Detection, Benchmarking	Code Available	0
A Survey on Vision Autoregressive Model	Nov 13, 2024	3D Generation, Benchmarking	Unverified	0
Evaluating the Generation of Spatial Relations in Text and Image Generative Models	Nov 12, 2024	Benchmarking, Image Generation	Unverified	0
BuckTales : A multi-UAV dataset for multi-object tracking and re-identification of wild antelopes	Nov 11, 2024	Benchmarking, Multi-Object Tracking	Unverified	0
Retrieval or Global Context Understanding? On Many-Shot In-Context Learning for Long-Context Evaluation	Nov 11, 2024	16k, Benchmarking	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified