SOTAVerified

Decision Making

Papers

Showing 126-150 of 12311 papers

Title | Status | Hype
AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models | Code | 2
AdaFlow: Imitation Learning with Variance-Adaptive Flow-Based Policies | Code | 2
ActiveSplat: High-Fidelity Scene Reconstruction through Active Gaussian Splatting | Code | 2
ADAPT: Action-aware Driving Caption Transformer | Code | 2
Dungeons and Data: A Large-Scale NetHack Dataset | Code | 2
Embodied LLM Agents Learn to Cooperate in Organized Teams | Code | 2
DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation | Code | 2
Doe-1: Closed-Loop Autonomous Driving with Large World Model | Code | 2
Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving | Code | 2
Disentangling Memory and Reasoning Ability in Large Language Models | Code | 2
Diffusion Actor-Critic with Entropy Regulator | Code | 2
Distribution-Free, Risk-Controlling Prediction Sets | Code | 2
Distributional Soft Actor-Critic with Three Refinements | Code | 2
Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models | Code | 2
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning | Code | 2
A Comprehensive Guide to Explainable AI: From Classical Models to LLMs | Code | 2
Cross-Prediction-Powered Inference | Code | 2
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR | Code | 2
Cumulative Reasoning with Large Language Models | Code | 2
Digital Player: Evaluating Large Language Models based Human-like Agent in Games | Code | 2
DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models | Code | 2
Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning | Code | 2
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances | Code | 2
A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models | Code | 2
Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | - | Unverified