SOTAVerified

Decision Making

Papers

Showing 101-150 of 12311 papers

Title | Status | Hype
Towards Responsible AI: Advances in Safety, Fairness, and Accountability of Autonomous Systems | - | 0
The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability | - | 0
FinHEAR: Human Expertise and Adaptive Risk-Aware Temporal Reasoning for Financial Decision-Making | - | 0
Spiking Neural Models for Decision-Making Tasks with Learning | Code | 0
Did I Faithfully Say What I Thought? Bridging the Gap Between Neural Activity and Self-Explanations in Large Language Models | - | 0
Understanding Software Engineering Agents Through the Lens of Traceability: An Empirical Study | - | 0
Measuring Data Science Automation: A Survey of Evaluation Tools for AI Assistants and Agents | - | 0
Unlocking the Potential of Large Language Models in the Nuclear Industry with Synthetic Data | - | 0
Real-Time Cascade Mitigation in Power Systems Using Influence Graph Improved by Reinforcement Learning | - | 0
Bayesian Inverse Physics for Neuro-Symbolic Robot Learning | - | 0
How to Provably Improve Return Conditioned Supervised Learning? | - | 0
Re4MPC: Reactive Nonlinear MPC for Multi-model Motion Planning via Deep Reinforcement Learning | Code | 1
HGFormer: A Hierarchical Graph Transformer Framework for Two-Stage Colonel Blotto Games via Reinforcement Learning | - | 0
Diffusion of Responsibility in Collective Decision Making | - | 0
SurgBench: A Unified Large-Scale Benchmark for Surgical Video Analysis | - | 0
REMoH: A Reflective Evolution of Multi-objective Heuristics approach via Large Language Models | - | 0
A Unified Anti-Jamming Design in Complex Environments Based on Cross-Modal Fusion and Intelligent Decision-Making | - | 0
LUCIFER: Language Understanding and Context-Infused Framework for Exploration and Behavior Refinement | - | 0
Accelerating Diffusion Models in Offline RL via Reward-Aware Consistency Trajectory Distillation | - | 0
Improving Fairness of Large Language Models in Multi-document Summarization | Code | 0
Accelerating Spectral Clustering under Fairness Constraints | - | 0
Benchmarking Pre-Trained Time Series Models for Electricity Price Forecasting | - | 0
CausalPFN: Amortized Causal Effect Estimation via In-Context Learning | Code | 2
A Narrative Review on Large AI Models in Lung Cancer Screening, Diagnosis, and Treatment Planning | - | 0
QForce-RL: Quantized FPGA-Optimized Reinforcement Learning Compute Engine | - | 0
Contextual Experience Replay for Self-Improvement of Language Agents | - | 0
QuantMCP: Grounding Large Language Models in Verifiable Financial Reality | - | 0
Prompting Wireless Networks: Reinforced In-Context Learning for Power Control | - | 0
Object Navigation with Structure-Semantic Reasoning-Based Multi-level Map and Multimodal Decision-Making LLM | - | 0
SurGSplat: Progressive Geometry-Constrained Gaussian Splatting for Surgical Scene Reconstruction | - | 0
Structured Labeling Enables Faster Vision-Language Models for End-to-End Autonomous Driving | - | 0
Natural Language Interaction with Databases on Edge Devices in the Internet of Battlefield Things | - | 0
Empowering Economic Simulation for Massively Multiplayer Online Games through Generative Agent-Based Modeling | - | 0
Look Before You Leap: A GUI-Critic-R1 Model for Pre-Operative Error Diagnosis in GUI Automation | - | 0
Ignoring Directionality Leads to Compromised Graph Neural Network Explanations | - | 0
Artificial Intelligence Should Genuinely Support Clinical Reasoning and Decision Making To Bridge the Translational Gap | - | 0
Impact of Hill coefficient and time delay on a perceptual decision-making model | - | 0
AD-EE: Early Exiting for Fast and Reliable Vision-Language Models in Autonomous Driving | - | 0
Conformal Mixed-Integer Constraint Learning with Feasibility Guarantees | - | 0
CLAIM: An Intent-Driven Multi-Agent Framework for Analyzing Manipulation in Courtroom Dialogues | Code | 0
An AI-Based Public Health Data Monitoring System | - | 0
OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis | Code | 1
Finding signatures of low-dimensional geometric landscapes in high-dimensional cell fate transitions | Code | 0
FPGA-Enabled Machine Learning Applications in Earth Observation: A Systematic Review | Code | 0
TextAtari: 100K Frames Game Playing with Language Agents | Code | 0
VS-Bench: Evaluating VLMs for Strategic Reasoning and Decision-Making in Multi-Agent Environments | - | 0
Joint Modeling for Learning Decision-Making Dynamics in Behavioral Experiments | - | 0
A Smart Multimodal Healthcare Copilot with Powerful LLM Reasoning | Code | 3
Improving Performance of Spike-based Deep Q-Learning using Ternary Neurons | - | 0
Designing Algorithmic Delegates: The Role of Indistinguishability in Human-AI Handoff | - | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | - | Unverified