Benchmarking

Papers

Showing 4921–4930 of 5548 papers

Title	Date	Tasks	Status	Hype
Can LLMs perform structured graph reasoning?	Feb 2, 2024	Benchmarking, Navigate	Code Available	0
Attention-based Class-Conditioned Alignment for Multi-Source Domain Adaptation of Object Detectors	Mar 14, 2024	Benchmarking, Domain Adaptation	Code Available	0
Exploring Model-based Planning with Policy Networks	Jun 20, 2019	Benchmarking, model	Code Available	0
Exploring Context Generalizability in Citywide Crowd Mobility Prediction: An Analytic Framework and Benchmark	Jun 30, 2021	Benchmarking, Prediction	Code Available	0
Multimodal Multi-User Surface Recognition with the Kernel Two-Sample Test	Mar 8, 2023	Benchmarking, Time Series	Code Available	0
Exploiting Out-of-Domain Parallel Data through Multilingual Transfer Learning for Low-Resource Neural Machine Translation	Jul 6, 2019	Benchmarking, Domain Adaptation	Code Available	0
Zero-shot generation of synthetic neurosurgical data with large language models	Feb 13, 2025	Benchmarking, De-identification	Code Available	0
Benchmarking Pathology Foundation Models: Adaptation Strategies and Scenarios	Oct 21, 2024	Benchmarking, Few-Shot Learning	Code Available	0
Three Revisits to Node-Level Graph Anomaly Detection: Outliers, Message Passing and Hyperbolic Neural Networks	Mar 6, 2024	Anomaly Detection, Benchmarking	Code Available	0
Multiple Instance Learning: A Survey of Problem Characteristics and Applications	Dec 11, 2016	Benchmarking, Document Classification	Code Available	0

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified