SOTAVerified

Decision Making

Papers

Showing 41-50 of 12311 papers

Title | Status | Hype
Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB | Code | 3
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping | Code | 3
Game-theoretic LLM: Agent Workflow for Negotiation Games | Code | 3
Hierarchical Prompting Assists Large Language Model on Web Navigation | Code | 3
Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models | Code | 3
Automatic Gradient Estimation for Calibrating Crowd Models with Discrete Decision Making | Code | 3
Evolve Cost-aware Acquisition Functions Using Large Language Models | Code | 3
Automated Hypothesis Validation with Agentic Sequential Falsifications | Code | 3
Evaluating Language Model Agency through Negotiations | Code | 3
Attention is not not Explanation | Code | 3

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | - | Unverified