SOTAVerified

counterfactual

Papers

Showing 1-50 of 2765 papers

Title | Status | Hype
Turning Sand to Gold: Recycling Data to Bridge On-Policy and Off-Policy Learning via Causal Bound | - | 0
HalluSegBench: Counterfactual Visual Reasoning for Segmentation Hallucination Evaluation | - | 0
Transformer-Based Spatial-Temporal Counterfactual Outcomes Estimation | Code | 0
Unveiling Causal Reasoning in Large Language Models: Reality or Mirage? | Code | 0
Active Inference AI Systems for Scientific Discovery | - | 0
Towards Two-Stage Counterfactual Learning to Rank | - | 0
Counterfactual Influence as a Distributional Quantity | - | 0
Causal Operator Discovery in Partial Differential Equations via Counterfactual Physics-Informed Neural Networks | - | 0
Argumentative Ensembling for Robust Recourse under Model Multiplicity | Code | 0
Center of Gravity-Guided Focusing Influence Mechanism for Multi-Agent Reinforcement Learning | - | 0
Blameless Users in a Clean Room: Defining Copyright Protection for Generative Models | - | 0
Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective | - | 0
Thought Anchors: Which LLM Reasoning Steps Matter? | Code | 2
Social Group Bias in AI Finance | - | 0
An introduction to Causal Modelling | - | 0
CF-Seg: Counterfactuals meet Segmentation | - | 0
The Role of Explanation Styles and Perceived Accuracy on Decision Making in Predictive Process Monitoring | - | 0
How does online shopping affect offline price sensitivity? | - | 0
Diffusion-based Counterfactual Augmentation: Towards Robust and Interpretable Knee Osteoarthritis Grading | Code | 1
Performative Validity of Recourse Explanations | - | 0
Towards Desiderata-Driven Design of Visual Counterfactual Explainers | - | 0
Causally Steered Diffusion for Automated Video Counterfactual Generation | Code | 0
Decoupled Classifier-Free Guidance for Counterfactual Diffusion Models | - | 0
Towards Fairness Assessment of Dutch Hate Speech Detection | - | 0
Think before You Simulate: Symbolic Reasoning to Orchestrate Neural Computation for Counterfactual Question Answering | Code | 0
WGSR-Bench: Wargame-based Game-theoretic Strategic Reasoning Benchmark for Large Language Models | - | 0
Towards Robust Multimodal Emotion Recognition under Missing Modalities and Distribution Shifts | Code | 1
CausalVQA: A Physically Grounded Causal Reasoning Benchmark for Video Models | Code | 2
ORIDa: Object-centric Real-world Image Composition Dataset | - | 0
Diffusion Counterfactual Generation with Semantic Abduction | Code | 0
Curriculum Learning With Counterfactual Group Relative Policy Advantage For Multi-Agent Reinforcement Learning | Code | 1
Cross-Entropy Games for Language Models: From Implicit Knowledge to General Capability Measures | - | 0
CrimeMind: Simulating Urban Crime with Multi-Modal LLM Agents | - | 0
Enhancing the Merger Simulation Toolkit with ML/AI | - | 0
Flattery, Fluff, and Fog: Diagnosing and Mitigating Idiosyncratic Biases in Preference Models | Code | 0
Knowledgeable-r1: Policy Optimization for Knowledge Exploration in Retrieval-Augmented Generation | Code | 0
Counterfactual reasoning: an analysis of in-context emergence | Code | 0
Evaluating Large Language Model Capabilities in Assessing Spatial Econometrics Research | - | 0
WANDER: An Explainable Decision-Support Framework for HPC | - | 0
WorldPrediction: A Benchmark for High-level World Modeling and Long-horizon Procedural Planning | - | 0
A meaningful prediction of functional decline in amyotrophic lateral sclerosis based on multi-event survival analysis | - | 0
Pricing the Right to Renege in Search Markets: Evidence from Trucking | - | 0
Life Sequence Transformer: Generative Modelling for Counterfactual Simulation | - | 0
Counterfactual Activation Editing for Post-hoc Prosody and Mispronunciation Correction in TTS Models | - | 0
Recover Experimental Data with Selection Bias using Counterfactual Logic | - | 0
Ctrl-Crash: Controllable Diffusion for Realistic Car Crashes | - | 0
Data Fusion for Partial Identification of Causal Effects | - | 0
From Invariant Representations to Invariant Data: Provable Robustness to Spurious Correlations via Noisy Counterfactual Matching | Code | 0
FOLIAGE: Towards Physical Intelligence World Models Via Unbounded Surface Evolution | - | 0
DiCoFlex: Model-agnostic diverse counterfactuals with flexible control | - | 0

No leaderboard results yet.