SOTAVerified

Decision Making

Papers

Showing 51-75 of 12311 papers

Title | Status | Hype
A Demonstration of Adaptive Collaboration of Large Language Models for Medical Decision-Making | Code | 3
Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making | Code | 3
Sentiment Reasoning for Healthcare | Code | 3
Reinforcement Learning Meets Visual Odometry | Code | 3
ACEGEN: Reinforcement learning of generative chemical agents for drug discovery | Code | 3
Evolve Cost-aware Acquisition Functions Using Large Language Models | Code | 3
MDAgents: An Adaptive Collaboration of LLMs for Medical Decision-Making | Code | 3
Enhancing Decision Analysis with a Large Language Model: pyDecision a Comprehensive Library of MCDA Methods in Python | Code | 3
Automatic Gradient Estimation for Calibrating Crowd Models with Discrete Decision Making | Code | 3
Behavior Generation with Latent Actions | Code | 3
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping | Code | 3
UniST: A Prompt-Empowered Universal Model for Urban Spatio-Temporal Prediction | Code | 3
SPO: Sequential Monte Carlo Policy Optimisation | Code | 3
V-IRL: Grounding Virtual Intelligence in Real Life | Code | 3
PokeLLMon: A Human-Parity Agent for Pokemon Battles with Large Language Models | Code | 3
Evaluating Language Model Agency through Negotiations | Code | 3
LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning | Code | 3
Hierarchical Prompting Assists Large Language Model on Web Navigation | Code | 3
Planning with Diffusion for Flexible Behavior Synthesis | Code | 3
Attention is not not Explanation | Code | 3
NavMorph: A Self-Evolving World Model for Vision-and-Language Navigation in Continuous Environments | Code | 2
CausalPFN: Amortized Causal Effect Estimation via In-Context Learning | Code | 2
Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning | Code | 2
Multi-Agent Reinforcement Learning for Resources Allocation Optimization: A Survey | Code | 2
Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 |  | Unverified