Benchmarking

Papers

Showing 701–710 of 5548 papers

Title	Date	Tasks	Status	Hype
Chaos as an interpretable benchmark for forecasting and data-driven modelling	Oct 11, 2021	Benchmarking, Symbolic Regression	Code Available	1
AllClear: A Comprehensive Dataset and Benchmark for Cloud Removal in Satellite Imagery	Oct 31, 2024	Benchmarking, Cloud Removal	Code Available	1
CCTV-Gun: Benchmarking Handgun Detection in CCTV Images	Mar 19, 2023	Benchmarking, object-detection	Code Available	1
A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking	Oct 14, 2022	Benchmarking, GPU	Code Available	1
Automatic sleep stage classification with deep residual networks in a mixed-cohort setting	Aug 21, 2020	Automatic Sleep Stage Classification, Benchmarking	Code Available	1
Cross-Modal Bidirectional Interaction Model for Referring Remote Sensing Image Segmentation	Oct 11, 2024	Benchmarking, Image Segmentation	Code Available	1
CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer	Dec 2, 2021	Benchmarking, Ordinal Classification	Code Available	1
Towards Reliable Detection of LLM-Generated Texts: A Comprehensive Evaluation Framework with CUDRT	Jun 13, 2024	Benchmarking, LLM-generated Text Detection	Code Available	1
CharacterBench: Benchmarking Character Customization of Large Language Models	Dec 16, 2024	Benchmarking	Code Available	1
A Ladder of Causal Distances	May 5, 2020	Benchmarking, Causal Discovery	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified