SOTAVerified

Decision Making

Papers

Showing 281-290 of 12311 papers

| Title | Status | Hype |
|---|---|---|
| Adversarial Testing in LLMs: Insights into Decision-Making Vulnerabilities | | 0 |
| Multi-Armed Bandits Meet Large Language Models | | 0 |
| SNAPE-PM: Building and Utilizing Dynamic Partner Models for Adaptive Explanation Generation | Code | 0 |
| RAGXplain: From Explainable Evaluation to Actionable Guidance of RAG Pipelines | | 0 |
| Of Mice and Machines: A Comparison of Learning Between Real World Mice and RL Agents | | 0 |
| Learning to Play Like Humans: A Framework for LLM Adaptation in Interactive Fiction Games | | 0 |
| Space evaluation at the starting point of soccer transitions | | 0 |
| Emotion Recognition for Low-Resource Turkish: Fine-Tuning BERTurk on TREMO and Testing on Xenophobic Political Discourse | | 0 |
| Solver-Informed RL: Grounding Large Language Models for Authentic Optimization Modeling | | 0 |
| Improving Fairness in LLMs Through Testing-Time Adversaries | | 0 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | SRLA | Average Remaining Cycles | 6.4 | | Unverified |