SOTAVerified

counterfactual

Papers

Showing 1-25 of 2765 papers

Title | Status | Hype
Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators | Code | 5
Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment | Code | 4
On the limits of agency in agent-based models | Code | 4
OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning | Code | 4
An Actionable Framework for Assessing Bias and Fairness in Large Language Model Use Cases | Code | 3
Anatomically-Controllable Medical Image Generation with Segmentation-Guided Diffusion Models | Code | 3
Sparse Autoencoders Find Highly Interpretable Features in Language Models | Code | 3
Towards CausalGPT: A Multi-Agent Approach for Faithful Knowledge Reasoning via Promoting Causal Consistency in LLMs | Code | 3
Locating and Editing Factual Associations in GPT | Code | 3
Difference-in-Differences Estimation with Spatial Spillovers | Code | 3
Thought Anchors: Which LLM Reasoning Steps Matter? | Code | 2
CausalVQA: A Physically Grounded Causal Reasoning Benchmark for Video Models | Code | 2
Vision Language Models are Biased | Code | 2
Generalized Semantic Contrastive Learning via Embedding Side Information for Few-Shot Object Detection | Code | 2
OptiChat: Bridging Optimization Models and Practitioners with Large Language Models | Code | 2
A Comprehensive Guide to Explainable AI: From Classical Models to LLMs | Code | 2
Derivative-Free Diffusion Manifold-Constrained Gradient for Unified XAI | Code | 2
HourVideo: 1-Hour Video-Language Understanding | Code | 2
Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality | Code | 2
FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows" | Code | 2
SocialCircle+: Learning the Angle-based Conditioned Interaction Representation for Pedestrian Trajectory Prediction | Code | 2
Exploring the Causality of End-to-End Autonomous Driving | Code | 2
Extended Mind Transformers | Code | 2
Decomposing and Editing Predictions by Modeling Model Computation | Code | 2
Generative Enhancement for 3D Medical Images | Code | 2

No leaderboard results yet.