Benchmarking

Papers

Showing 2351–2360 of 5548 papers

Title	Date	Tasks	Status	Hype	Score
Good at captioning, bad at counting: Benchmarking GPT-4V on Earth observation data	Jan 31, 2024	Benchmarking, Change Detection	Code Available	0	5
Graph Convolutional Networks Meet with High Dimensionality Reduction	Nov 7, 2019	Benchmarking, Dimensionality Reduction	Code Available	0	5
GiantHunter: Accurate detection of giant virus in metagenomic data using reinforcement-learning and Monte Carlo tree search	Jan 26, 2025	Benchmarking, Diversity	Code Available	0	5
Improve Machine Learning carbon footprint using Parquet dataset format and Mixed Precision training for regression models -- Part II	Sep 17, 2024	Benchmarking, Descriptive	Code Available	0	5
Global Prediction of COVID-19 Variant Emergence Using Dynamics-Informed Graph Neural Networks	Jan 7, 2024	Benchmarking, Graph Neural Network	Code Available	0	5
Benchmarking LLM-based Relevance Judgment Methods	Apr 17, 2025	Benchmarking, Open-Domain Question Answering	Code Available	0	5
Enhancing Treatment Effect Estimation via Active Learning: A Counterfactual Covering Perspective	May 8, 2025	Active Learning, Benchmarking	Code Available	0	5
Improving Pretrained Models for Zero-shot Multi-label Text Classification through Reinforced Label Hierarchy Reasoning	Apr 4, 2021	Benchmarking, Multi Label Text Classification	Code Available	0	5
Geological Inference from Textual Data using Word Embeddings	Apr 10, 2025	Benchmarking, Word Embeddings	Code Available	0	5
DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs	Oct 1, 2019	Benchmarking, Dialogue Generation	Code Available	0	5

Page 236 of 555

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified