SOTAVerified

Decision Making

Papers

Showing 101-125 of 12311 papers

Title | Status | Hype
DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation | Code | 2
Graph Neural Network Surrogates to leverage Mechanistic Expert Knowledge towards Reliable and Immediate Pandemic Response | Code | 2
Concept Bottleneck Language Models For protein design | Code | 2
LLM-PySC2: Starcraft II learning environment for Large Language Models | Code | 2
AdaSociety: An Adaptive Environment with Social Structures for Multi-Agent Decision-Making | Code | 2
A Survey of Financial AI: Architectures, Advances and Open Challenges | Code | 2
ActiveSplat: High-Fidelity Scene Reconstruction through Active Gaussian Splatting | Code | 2
Context is Key: A Benchmark for Forecasting with Essential Textual Information | Code | 2
Literature Meets Data: A Synergistic Approach to Hypothesis Generation | Code | 2
Improving Causal Reasoning in Large Language Models: A Survey | Code | 2
A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models | Code | 2
Local Off-Grid Weather Forecasting with Multi-Modal Earth Observation Data | Code | 2
Process Reward Model with Q-Value Rankings | Code | 2
ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities | Code | 2
Towards Interactive and Learnable Cooperative Driving Automation: a Large Language Model-Driven Decision-Making Framework | Code | 2
Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search | Code | 2
MOMAland: A Set of Benchmarks for Multi-Objective Multi-Agent Reinforcement Learning | Code | 2
CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis | Code | 2
UrbanWorld: An Urban World Model for 3D City Generation | Code | 2
FinCon: A Synthesized LLM Multi-Agent System with Conceptual Verbal Reinforcement for Enhanced Financial Decision Making | Code | 2
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents | Code | 2
ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild | Code | 2
MacroHFT: Memory Augmented Context-aware Reinforcement Learning On High Frequency Trading | Code | 2
PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers | Code | 2
CleanDiffuser: An Easy-to-use Modularized Library for Diffusion Models in Decision Making | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | | Unverified