SOTAVerified

Decision Making

Papers

Showing 901-950 of 12311 papers

Title | Status | Hype
Med-gte-hybrid: A contextual embedding transformer model for extracting actionable information from clinical texts | | 0
Graph Attention Convolutional U-NET: A Semantic Segmentation Model for Identifying Flooded Areas | | 0
Exploring Embodied Multimodal Large Models: Development, Datasets, and Future Directions | | 0
A Knowledge Distillation-Based Approach to Enhance Transparency of Classifier Models | Code | 0
Alignment, Agency and Autonomy in Frontier AI: A Systems Engineering Perspective | | 0
An Interpretable Machine Learning Approach to Understanding the Relationships between Solar Flares and Source Active Regions | | 0
Multi-Objective Causal Bayesian Optimization | Code | 1
SPRIG: Stackelberg Perception-Reinforcement Learning with Internal Game Dynamics | | 0
Reinforcement Learning for Ultrasound Image Analysis A Comprehensive Review of Advances and Applications | | 0
PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC | Code | 9
Investigating the Impact of LLM Personality on Cognitive Bias Manifestation in Automated Decision-Making Tasks | | 0
The Impact and Feasibility of Self-Confidence Shaping for AI-Assisted Decision-Making | | 0
Beyond Self-Talk: A Communication-Centric Survey of LLM-Based Multi-Agent Systems | | 0
STeCa: Step-level Trajectory Calibration for LLM Agent Learning | Code | 1
Mem2Ego: Empowering Vision-Language Models with Global-to-Ego Memory for Long-Horizon Embodied Navigation | | 0
How Far are LLMs from Being Our Digital Twins? A Benchmark for Persona-Based Behavior Chain Simulation | Code | 1
MedHallu: A Comprehensive Benchmark for Detecting Medical Hallucinations in Large Language Models | | 0
Human Misperception of Generative-AI Alignment: A Laboratory Experiment | | 0
Online detection of forecast model inadequacies using forecast errors | | 0
Black Sheep in the Herd: Playing with Spuriously Correlated Attributes for Vision-Language Recognition | | 0
Playing Hex and Counter Wargames using Reinforcement Learning and Recurrent Neural Networks | Code | 0
Human-Artificial Interaction in the Age of Agentic AI: A System-Theoretical Approach | | 0
LLM-Enhanced Dialogue Management for Full-Duplex Spoken Dialogue Systems | | 0
Benchmarking LLMs for Political Science: A United Nations Perspective | Code | 1
Why Safeguarded Ships Run Aground? Aligned Large Language Models' Safety Mechanisms Tend to Be Anchored in The Template Region | | 0
RobustX: Robust Counterfactual Explanations Made Easy | Code | 1
AgentCF++: Memory-enhanced LLM-based Agents for Popularity-aware Cross-domain Recommendations | Code | 0
Fighter Jet Navigation and Combat using Deep Reinforcement Learning with Explainable AI | Code | 0
AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence | Code | 1
RGAR: Recurrence Generation-augmented Retrieval for Factual-aware Medical Question Answering | | 0
MCTS-KBQA: Monte Carlo Tree Search for Knowledge Base Question Answering | | 0
Fraud-R1 : A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements | Code | 1
LLM Trading: Analysis of LLM Agent Behavior in Experimental Asset Markets | | 0
MindLLM: A Subject-Agnostic and Versatile Model for fMRI-to-Text Decoding | | 0
Should I Trust You? Detecting Deception in Negotiations using Counterfactual RL | | 0
AEIA-MN: Evaluating the Robustness of Multimodal LLM-Powered Mobile Agents Against Active Environmental Injection Attacks | | 0
Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger | | 0
Value Gradient Sampler: Sampling as Sequential Decision Making | Code | 0
Capturing Human Cognitive Styles with Language: Towards an Experimental Evaluation Paradigm | | 0
Towards Robust and Secure Embodied AI: A Survey on Vulnerabilities and Attacks | | 0
Conditional Max-Sum for Asynchronous Multiagent Decision Making | | 0
Adjust for Trust: Mitigating Trust-Induced Inappropriate Reliance on AI Assistance | | 0
AI-Assisted Decision Making with Human Learning | | 0
Addressing Moral Uncertainty using Large Language Models for Ethical Decision-Making | | 0
One for All: A General Framework of LLMs-based Multi-Criteria Decision Making on Human Expert Level | | 0
Unveiling Privacy Risks in LLM Agent Memory | | 0
Human-centered explanation does not fit all: The interplay of sociotechnical, cognitive, and individual factors in the effect AI explanations in algorithmic decision-making | | 0
Scaling Autonomous Agents via Automatic Reward Modeling And Planning | | 0
ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability | | 0
QoS based resource management for concurrent operation using MCTS | | 0

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 | | Unverified