SOTAVerified

Benchmarking

Papers

Showing 1101–1110 of 5548 papers

Title	Date	Tasks	Status	Hype
A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care	Sep 16, 2022	Benchmarking, Deep Learning	Code Available	1
ScreenQA: Large-Scale Question-Answer Pairs over Mobile App Screenshots	Sep 16, 2022	Benchmarking, Question Answering	Code Available	1
Benchmarking Multimodal Variational Autoencoders: CdSprites+ Dataset and Toolkit	Sep 7, 2022	Benchmarking	Code Available	1
Structural Bias for Aspect Sentiment Triplet Extraction	Sep 2, 2022	Aspect Sentiment Triplet Extraction, Benchmarking	Code Available	1
nnOOD: A Framework for Benchmarking Self-supervised Anomaly Localisation Methods	Sep 2, 2022	Anomaly Detection, Benchmarking	Code Available	1
Benchmarking Compositionality with Formal Languages	Aug 17, 2022	Benchmarking, Open-Ended Question Answering	Code Available	1
CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods	Aug 2, 2022	Benchmarking, Causal Discovery	Code Available	1
A Multifaceted Benchmarking of Synthetic Electronic Health Record Generation Models	Aug 2, 2022	Benchmarking, Synthetic Data Generation	Code Available	1
Accelerated and interpretable oblique random survival forests	Aug 1, 2022	Benchmarking, Computational Efficiency	Code Available	1
Tracking Every Thing in the Wild	Jul 26, 2022	Benchmarking, Classification	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified