SOTAVerified

Decision Making

Papers

Showing 51-60 of 12311 papers

Title | Status | Hype
FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution | Code | 3
Automated Hypothesis Validation with Agentic Sequential Falsifications | Code | 3
AuctionNet: A Novel Benchmark for Decision-Making in Large-Scale Games | Code | 3
Evaluating Language Model Agency through Negotiations | Code | 3
Automatic Gradient Estimation for Calibrating Crowd Models with Discrete Decision Making | Code | 3
Attention is not not Explanation | Code | 3
Evolve Cost-aware Acquisition Functions Using Large Language Models | Code | 3
Game-theoretic LLM: Agent Workflow for Negotiation Games | Code | 3
Enhancing Decision Analysis with a Large Language Model: pyDecision a Comprehensive Library of MCDA Methods in Python | Code | 3
Embodied CoT Distillation From LLM To Off-the-shelf Agents | Code | 3
Page 6 of 1232

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | | Unverified