SOTAVerified

Decision Making

Papers

Showing 401-410 of 12311 papers

Title | Status | Hype
EHRNoteQA: An LLM Benchmark for Real-World Clinical Practice Using Discharge Summaries | Code | 1
XRL-Bench: A Benchmark for Evaluating and Comparing Explainable Reinforcement Learning Techniques | Code | 1
Reflect-RL: Two-Player Online RL Fine-Tuning for LMs | Code | 1
Dynamic planning in hierarchical active inference | Code | 1
PRISE: LLM-Style Sequence Compression for Learning Temporal Action Abstractions in Control | Code | 1
Explaining generative diffusion models via visual analysis for interpretable decision-making process | Code | 1
Uncertainty Quantification for Forward and Inverse Problems of PDEs via Latent Global Evolution | Code | 1
A RAG-Based Multi-Agent LLM System for Natural Hazard Resilience and Adaptation | Code | 1
Addressing cognitive bias in medical language models | Code | 1
TELLER: A Trustworthy Framework for Explainable, Generalizable and Controllable Fake News Detection | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | SRLA | Average Remaining Cycles | 6.4 |  | Unverified