SOTAVerified|Agents Browse Leaderboard About

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 901–910 of 5548 papers

Title	Date	Tasks	Status	Hype
An Image Dataset for Benchmarking Recommender Systems with Raw Pixels	Sep 13, 2023	BenchmarkingRecommendation Systems	CodeCode Available	1
Labelling unlabelled videos from scratch with multi-modal self-supervision	Jun 24, 2020	BenchmarkingClustering	CodeCode Available	1
A Comprehensive Benchmark for COVID-19 Predictive Modeling Using Electronic Health Records in Intensive Care	Sep 16, 2022	BenchmarkingDeep Learning	CodeCode Available	1
AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM	Nov 26, 2024	BenchmarkingText-to-Video Generation	CodeCode Available	1
AD-LLM: Benchmarking Large Language Models for Anomaly Detection	Dec 15, 2024	Anomaly DetectionBenchmarking	CodeCode Available	1
ClimART: A Benchmark Dataset for Emulating Atmospheric Radiative Transfer in Weather and Climate Models	Nov 29, 2021	BenchmarkingPhysical Simulations	CodeCode Available	1
CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods	Aug 2, 2022	BenchmarkingCausal Discovery	CodeCode Available	1
Benchmarking Counterfactual Image Generation	Mar 29, 2024	BenchmarkingConditional Image Generation	CodeCode Available	1
AdsorbML: A Leap in Efficiency for Adsorption Energy Calculations using Generalizable Machine Learning Potentials	Nov 29, 2022	Benchmarking	CodeCode Available	1
CIDEr: Consensus-based Image Description Evaluation	Nov 20, 2014	Action RecognitionAttribute	CodeCode Available	1

Show:10 25 50

← PrevPage 91 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified