Benchmarking

Papers

Showing 961–970 of 5548 papers

Title	Date	Tasks	Status	Hype
Challenges and Opportunities in Improving Worst-Group Generalization in Presence of Spurious Features	Jun 21, 2023	Benchmarking, Model Selection	Code Available	1
GADBench: Revisiting and Benchmarking Supervised Graph Anomaly Detection	Jun 21, 2023	Anomaly Detection, Benchmarking	Code Available	1
Benchmarking and Analyzing 3D-aware Image Synthesis with a Modularized Codebase	Jun 21, 2023	3D-Aware Image Synthesis, Benchmarking	Code Available	1
IMP-MARL: a Suite of Environments for Large-scale Infrastructure Management Planning via MARL	Jun 20, 2023	Benchmarking, Management	Code Available	1
Geometric Deep Learning for Structure-Based Drug Design: A Survey	Jun 20, 2023	Benchmarking, Deep Learning	Code Available	1
causalAssembly: Generating Realistic Production Data for Benchmarking Causal Discovery	Jun 19, 2023	Benchmarking, Causal Discovery	Code Available	1
Beyond Normal: On the Evaluation of Mutual Information Estimators	Jun 19, 2023	Benchmarking, Domain Generalization	Code Available	1
CompanyKG: A Large-Scale Heterogeneous Graph for Company Similarity Quantification	Jun 18, 2023	Benchmarking, Retrieval	Code Available	1
OpenDataVal: a Unified Benchmark for Data Valuation	Jun 18, 2023	Benchmarking, Data Valuation	Code Available	1
Evaluating Graph Neural Networks for Link Prediction: Current Pitfalls and New Benchmarking	Jun 18, 2023	Benchmarking, Link Prediction	Code Available	1

Page 97 of 555

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified