SOTAVerified

Decision Making

Papers

Showing 376-400 of 12311 papers

| Title | Status | Hype |
| --- | --- | --- |
| LawInstruct: A Resource for Studying Language Model Adaptation to the Legal Domain | Code | 1 |
| OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising | Code | 1 |
| LITE: Modeling Environmental Ecosystems with Multimodal Large Language Models | Code | 1 |
| Language Models are Spacecraft Operators | Code | 1 |
| Linguistic Calibration of Long-Form Generations | Code | 1 |
| Decision Mamba: Reinforcement Learning via Sequence Modeling with Selective State Spaces | Code | 1 |
| Optimization-based Prompt Injection Attack to LLM-as-a-Judge | Code | 1 |
| MMIDR: Teaching Large Language Model to Interpret Multimodal Misinformation via Knowledge Distillation | Code | 1 |
| Towards Learning Contrast Kinetics with Multi-Condition Latent Diffusion Models | Code | 1 |
| Uncertainty quantification for data-driven weather models | Code | 1 |
| Probabilistic Calibration by Design for Neural Network Regression | Code | 1 |
| LLM Guided Evolution - The Automation of Models Advancing Models | Code | 1 |
| Driving Style Alignment for LLM-powered Driver Agent | Code | 1 |
| Beyond Pixels: Enhancing LIME with Hierarchical Features and Segmentation Foundation Models | Code | 1 |
| Reinforced Sequential Decision-Making for Sepsis Treatment: The POSNEGDM Framework with Mortality Classifier and Transformer | Code | 1 |
| Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents | Code | 1 |
| Human vs. Machine: Behavioral Differences Between Expert Humans and Language Models in Wargame Simulations | Code | 1 |
| MiKASA: Multi-Key-Anchor & Scene-Aware Transformer for 3D Visual Grounding | Code | 1 |
| AgentsCourt: Building Judicial Decision-Making Agents with Court Debate Simulation and Legal Knowledge Augmentation | Code | 1 |
| ComTraQ-MPC: Meta-Trained DQN-MPC Integration for Trajectory Tracking with Limited Active Localization Updates | Code | 1 |
| Playing NetHack with LLMs: Potential & Limitations as Zero-Shot Agents | Code | 1 |
| MemoNav: Working Memory Model for Visual Navigation | Code | 1 |
| Large Language Models are Learnable Planners for Long-Term Recommendation | Code | 1 |
| Benchmarking Data Science Agents | Code | 1 |
| How Can LLM Guide RL? A Value-Based Approach | Code | 1 |

Benchmark Results

| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | SRLA | Average Remaining Cycles | 6.4 | | Unverified |