SOTAVerified

Benchmarking

Papers

Showing 411–420 of 5548 papers

Title	Date	Tasks	Status	Hype
Towards Motion Forecasting with Real-World Perception Inputs: Are End-to-End Approaches Competitive?	Jun 15, 2023	Autonomous Driving, Autonomous Vehicles	Code Available	1
Chaos as an interpretable benchmark for forecasting and data-driven modelling	Oct 11, 2021	Benchmarking, Symbolic Regression	Code Available	1
On the Detectability of ChatGPT Content: Benchmarking, Methodology, and Evaluation through the Lens of Academic Writing	Jun 7, 2023	Benchmarking, Prompt Engineering	Code Available	1
CAVIAR: Co-simulation of 6G Communications, 3D Scenarios and AI for Digital Twins	Jan 6, 2024	Autonomous Vehicles, Benchmarking	Code Available	1
CBench: Towards Better Evaluation of Question Answering Over Knowledge Graphs	Apr 5, 2021	Benchmarking, Knowledge Graphs	Code Available	1
Causality for Tabular Data Synthesis: A High-Order Structure Causal Benchmark Framework	Jun 12, 2024	Benchmarking, Causal Inference	Code Available	1
CausalTime: Realistically Generated Time-series for Benchmarking of Causal Discovery	Oct 3, 2023	Benchmarking, Causal Discovery	Code Available	1
CCTV-Gun: Benchmarking Handgun Detection in CCTV Images	Mar 19, 2023	Benchmarking, object-detection	Code Available	1
CARLA: A Python Library to Benchmark Algorithmic Recourse and Counterfactual Explanation Algorithms	Aug 2, 2021	Benchmarking, counterfactual	Code Available	1
Accelerated and interpretable oblique random survival forests	Aug 1, 2022	Benchmarking, Computational Efficiency	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified