SOTAVerified

Benchmarking

Papers

Showing 1471–1480 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Benchmarking Graph Neural Networks on Dynamic Link Prediction	Sep 29, 2021	Benchmarking, Dynamic Link Prediction	Code Available	1	5
Benchmarking Graph Neural Networks for FMRI analysis	Nov 16, 2022	Benchmarking	Code Available	1	5
MatTools: Benchmarking Large Language Models for Materials Science Tools	May 16, 2025	Benchmarking, Question Answering	Code Available	1	5
Boosting Healthcare LLMs Through Retrieved Context	Sep 23, 2024	Benchmarking, Multiple-choice	Code Available	1	5
Beyond neural scaling laws: beating power law scaling via data pruning	Jun 29, 2022	Benchmarking	Code Available	1	5
Safety-enhanced UAV Path Planning with Spherical Vector-based Particle Swarm Optimization	Apr 13, 2021	Benchmarking, Metaheuristic Optimization	Code Available	1	5
Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models	Dec 15, 2023	Benchmarking, Code Summarization	Code Available	1	5
DependEval: Benchmarking LLMs for Repository Dependency Understanding	Mar 9, 2025	Benchmarking, Code Generation	Code Available	1	5
Labelling unlabelled videos from scratch with multi-modal self-supervision	Jun 24, 2020	Benchmarking, Clustering	Code Available	1	5
LogLead -- Fast and Integrated Log Loader, Enhancer, and Anomaly Detector	Nov 20, 2023	Anomaly Detection, Benchmarking	Code Available	1	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified