SOTAVerified|Agents Browse Leaderboard About Blog

Benchmarking

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 4181–4190 of 5548 papers

Title	Date	Tasks	Status	Hype
Towards Benchmarking and Assessing the Safety and Robustness of Autonomous Driving on Safety-critical Scenarios	Mar 31, 2025	Adversarial AttackAutonomous Driving	—Unverified	0
Towards Benchmarking and Evaluating Deepfake Detection	Mar 4, 2022	BenchmarkingDeepFake Detection	—Unverified	0
Towards Benchmarking Explainable Artificial Intelligence Methods	Aug 25, 2022	BenchmarkingExplainable artificial intelligence	—Unverified	0
Towards Benchmarking Scene Background Initialization	Jun 12, 2015	Benchmarking	—Unverified	0
Towards Benchmarking the Utility of Explanations for Model Debugging	May 10, 2021	Benchmarking	—Unverified	0
Towards Class-agnostic Tracking Using Feature Decorrelation in Point Clouds	Feb 28, 2022	BenchmarkingObject Tracking	—Unverified	0
Towards Effective Disambiguation for Machine Translation with Large Language Models	Sep 20, 2023	BenchmarkingIn-Context Learning	—Unverified	0
Towards Efficient Multi-LLM Inference: Characterization and Analysis of LLM Routing and Hierarchical Techniques	Jun 6, 2025	BenchmarkingModel Selection	—Unverified	0
Towards Explainability and Fairness in Swiss Judgement Prediction: Benchmarking on a Multilingual Dataset	Feb 26, 2024	BenchmarkingCross-Lingual Transfer	—Unverified	0
Towards Explainable Network Intrusion Detection using Large Language Models	Aug 8, 2024	BenchmarkingIntrusion Detection	—Unverified	0

Show:10 25 50

← PrevPage 419 of 555Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GPT-4 Turbo	ACC	0.56	—	Unverified