SOTAVerified

Decision Making

Papers

Showing 491–500 of 12311 papers

Title	Date	Tasks	Status	Hype
Explainable Claim Verification via Knowledge-Grounded Reasoning with Large Language Models	Oct 8, 2023	Claim Verification, Decision Making	Code Available	1
AvalonBench: Evaluating LLMs Playing the Game of Avalon	Oct 8, 2023	Decision Making	Code Available	1
Deep Learning for Two-Stage Robust Integer Optimization	Oct 6, 2023	Decision Making, Deep Learning	Code Available	1
Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets	Oct 6, 2023	D4RL, Decision Making	Code Available	1
Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning	Oct 4, 2023	Decision Making, Language Modeling	Code Available	1
MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use	Oct 4, 2023	Decision Making	Code Available	1
Trainable Noise Model as an XAI evaluation method: application on Sobol for remote sensing image segmentation	Oct 3, 2023	Autonomous Driving, Decision Making	Code Available	1
Towards Robust Fidelity for Evaluating Explainability of Graph Neural Networks	Oct 3, 2023	Decision Making	Code Available	1
Talk2BEV: Language-enhanced Bird's-eye View Maps for Autonomous Driving	Oct 3, 2023	Autonomous Driving, Decision Making	Code Available	1
Mini-BEHAVIOR: A Procedurally Generated Benchmark for Long-horizon Decision-Making in Embodied AI	Oct 3, 2023	Decision Making	Code Available	1

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	SRLA	Average Remaining Cycles	6.4	—	Unverified