SOTAVerified

Decision Making

Papers

Showing 151-200 of 12311 papers

Title | Status | Hype
Global birdsong embeddings enable superior transfer learning for bioacoustic classification | Code | 2
Embodied LLM Agents Learn to Cooperate in Organized Teams | Code | 2
Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models | Code | 2
Astock: A New Dataset and Automated Stock Trading based on Stock-specific News Analyzing Model | Code | 2
AdaSociety: An Adaptive Environment with Social Structures for Multi-Agent Decision-Making | Code | 2
Expandable Subspace Ensemble for Pre-Trained Model-Based Class-Incremental Learning | Code | 2
ADAPT: Action-aware Driving Caption Transformer | Code | 2
Distributional Soft Actor-Critic with Three Refinements | Code | 2
Dungeons and Data: A Large-Scale NetHack Dataset | Code | 2
AdaFlow: Imitation Learning with Variance-Adaptive Flow-Based Policies | Code | 2
DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation | Code | 2
Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving | Code | 2
FinCon: A Synthesized LLM Multi-Agent System with Conceptual Verbal Reinforcement for Enhanced Financial Decision Making | Code | 2
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents | Code | 2
AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models | Code | 2
A Review of Safe Reinforcement Learning: Methods, Theory and Applications | Code | 2
Distribution-Free, Risk-Controlling Prediction Sets | Code | 2
DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models | Code | 2
ActiveSplat: High-Fidelity Scene Reconstruction through Active Gaussian Splatting | Code | 2
Disentangling Memory and Reasoning Ability in Large Language Models | Code | 2
Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning | Code | 2
Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning | Code | 2
Diffusion Actor-Critic with Entropy Regulator | Code | 2
Hierarchical Expert Prompt for Large-Language-Model: An Approach Defeat Elite AI in TextStarCraft II for the First Time | Code | 2
Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions | Code | 2
Revocable Deep Reinforcement Learning with Affinity Regularization for Outlier-Robust Graph Matching | Code | 2
iVideoGPT: Interactive VideoGPTs are Scalable World Models | Code | 2
Jack of All Trades, Master of Some, a Multi-Purpose Transformer Agent | Code | 2
Digital Player: Evaluating Large Language Models based Human-like Agent in Games | Code | 2
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances | Code | 2
A New Approach to Solving SMAC Task: Generating Decision Tree Code from Large Language Models | Code | 2
LatteReview: A Multi-Agent Framework for Systematic Review Automation Using Large Language Models | Code | 2
Cross-Prediction-Powered Inference | Code | 2
Cumulative Reasoning with Large Language Models | Code | 2
Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents | Code | 2
LLM-PySC2: Starcraft II learning environment for Large Language Models | Code | 2
Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving | Code | 2
Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR | Code | 2
Agentic Knowledgeable Self-awareness | Code | 2
LVBench: An Extreme Long Video Understanding Benchmark | Code | 2
DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning | Code | 2
Concept Bottleneck Language Models For protein design | Code | 2
MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models | Code | 2
CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games | Code | 2
CoD, Towards an Interpretable Medical Agent using Chain of Diagnosis | Code | 2
MACRec: a Multi-Agent Collaboration Framework for Recommendation | Code | 2
Multi-Agent Reinforcement Learning for Resources Allocation Optimization: A Survey | Code | 2
BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents | Code | 2
Natural Language Reinforcement Learning | Code | 2
A Comprehensive Guide to Explainable AI: From Classical Models to LLMs | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | | Unverified