| Order Doesn't Matter, But Reasoning Does: Training LLMs with Order-Centric Augmentation | Feb 27, 2025 | Data Augmentation, Logical Reasoning | Unverified | 0 |
| Universality of conformal prediction under the assumption of randomness | Feb 26, 2025 | Conformal Prediction, Prediction | Unverified | 0 |
| Overcoming Dependent Censoring in the Evaluation of Survival Models | Feb 26, 2025 | Survival Analysis, valid | Code Available | 0 |
| Talking to the brain: Using Large Language Models as Proxies to Model Brain Semantic Representation | Feb 26, 2025 | Question Answering, valid | Unverified | 0 |
| Shh, don't say that! Domain Certification in LLMs | Feb 26, 2025 | valid | Unverified | 0 |
| Uncertainty Quantification for LLM-Based Survey Simulations | Feb 25, 2025 | Survey, Uncertainty Quantification | Unverified | 0 |
| Beyond In-Distribution Success: Scaling Curves of CoT Granularity for Language Model Generalization | Feb 25, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Data-Driven Input-Output Control Barrier Functions | Feb 24, 2025 | State Estimation, valid | Unverified | 0 |
| Quantifying Logical Consistency in Transformers via Query-Key Alignment | Feb 24, 2025 | Logical Reasoning, valid | Unverified | 0 |
| REGen: A Reliable Evaluation Framework for Generative Event Argument Extraction | Feb 24, 2025 | Event Argument Extraction, valid | Unverified | 0 |
| Your Assumed DAG is Wrong and Here's How To Deal With It | Feb 24, 2025 | Causal Discovery, valid | Code Available | 0 |
| Auto-Bench: An Automated Benchmark for Scientific Discovery in LLMs | Feb 21, 2025 | scientific discovery, valid | Unverified | 0 |
| Pricing Valid Cuts for Price-Match Equilibria | Feb 21, 2025 | valid | Unverified | 0 |
| EquivaMap: Leveraging LLMs for Automatic Equivalence Checking of Optimization Formulations | Feb 20, 2025 | Combinatorial Optimization, valid | Code Available | 0 |
| Towards a Perspectivist Turn in Argument Quality Assessment | Feb 20, 2025 | valid | Code Available | 0 |
| Explainable Distributed Constraint Optimization Problems | Feb 19, 2025 | valid | Unverified | 0 |
| Conformal Prediction under Levy-Prokhorov Distribution Shifts: Robustness to Local and Global Perturbations | Feb 19, 2025 | Conformal Prediction, Prediction | Code Available | 0 |
| Generalization error bound for denoising score matching under relaxed manifold assumption | Feb 19, 2025 | Denoising, valid | Unverified | 0 |
| What are Models Thinking about? Understanding Large Language Model Hallucinations "Psychology" through Model Inner State Analysis | Feb 19, 2025 | Hallucination, Language Modeling | Unverified | 0 |
| Likelihood-Ratio Regularized Quantile Regression: Adapting Conformal Prediction to High-Dimensional Covariate Shifts | Feb 18, 2025 | Conformal Prediction, image-classification | Unverified | 0 |
| GiFT: Gibbs Fine-Tuning for Code Generation | Feb 17, 2025 | Code Generation, valid | Code Available | 0 |
| Deep Incomplete Multi-view Learning via Cyclic Permutation of VAEs | Feb 16, 2025 | MULTI-VIEW LEARNING, Representation Learning | Unverified | 0 |
| The Relationship between No-Regret Learning and Online Conformal Prediction | Feb 16, 2025 | Conformal Prediction, valid | Unverified | 0 |
| A new and flexible class of sharp asymptotic time-uniform confidence sequences | Feb 14, 2025 | valid | Unverified | 0 |
| Self-Normalized Inference in (Quantile, Expected Shortfall) Regressions for Time Series | Feb 14, 2025 | quantile regression, regression | Unverified | 0 |
| Multi-Objective Planning with Contextual Lexicographic Reward Preferences | Feb 13, 2025 | valid | Unverified | 0 |
| Trust Me, I Know the Way: Predictive Uncertainty in the Presence of Shortcut Learning | Feb 13, 2025 | valid | Unverified | 0 |
| Generalizability through Explainability: Countering Overfitting with Counterfactual Examples | Feb 13, 2025 | counterfactual, Data Augmentation | Unverified | 0 |
| CRANE: Reasoning with constrained LLM generation | Feb 13, 2025 | Code Generation, Math | Unverified | 0 |
| High-Throughput SAT Sampling | Feb 12, 2025 | GPU, valid | Code Available | 0 |
| Inference in dynamic models for panel data using the moving block bootstrap | Feb 12, 2025 | valid | Unverified | 0 |
| On Training-Conditional Conformal Prediction and Binomial Proportion Confidence Intervals | Feb 11, 2025 | Conformal Prediction, Uncertainty Quantification | Unverified | 0 |
| Beyond Confidence: Adaptive Abstention in Dual-Threshold Conformal Prediction for Autonomous System Perception | Feb 11, 2025 | Conformal Prediction, Uncertainty Quantification | Code Available | 0 |
| Experiments in the Linear Convex Order | Feb 10, 2025 | valid | Unverified | 0 |
| Krum Federated Chain (KFC): Using blockchain to defend against adversarial attacks in Federated Learning | Feb 10, 2025 | Federated Learning, image-classification | Code Available | 0 |
| Dual Conic Proxy for Semidefinite Relaxation of AC Optimal Power Flow | Feb 10, 2025 | Self-Supervised Learning, valid | Unverified | 0 |
| Tokenization Standards for Linguistic Integrity: Turkish as a Benchmark | Feb 10, 2025 | MMLU, Morphological Analysis | Unverified | 0 |
| On the Impact of the Utility in Semivalue-based Data Valuation | Feb 10, 2025 | Data Valuation, valid | Unverified | 0 |
| Smooth Sailing: Lipschitz-Driven Uncertainty Quantification for Spatial Association | Feb 9, 2025 | Epidemiology, Uncertainty Quantification | Code Available | 0 |
| Forbidden Science: Dual-Use AI Challenge Benchmark and Scientific Refusal Tests | Feb 8, 2025 | valid | Unverified | 0 |
| Generative-enhanced optimization for knapsack problems: an industry-relevant study | Feb 7, 2025 | Tensor Networks, valid | Unverified | 0 |
| t-Testing the Waters: Empirically Validating Assumptions for Reliable A/B-Testing | Feb 7, 2025 | Experimental Design, valid | Unverified | 0 |
| Automating a Complete Software Test Process Using LLMs: An Automotive Case Study | Feb 6, 2025 | valid | Unverified | 0 |
| Combining Clusters for the Approximate Randomization Test | Feb 6, 2025 | valid | Unverified | 0 |
| First-ish Order Methods: Hessian-aware Scalings of Gradient Descent | Feb 6, 2025 | valid | Unverified | 0 |
| Efficient Randomized Experiments Using Foundation Models | Feb 6, 2025 | valid | Code Available | 0 |
| Change Point Detection in the Frequency Domain with Statistical Reliability | Feb 5, 2025 | Change Point Detection, valid | Unverified | 0 |
| SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models | Feb 5, 2025 | valid | Code Available | 1 |
| FAB-PPI: Frequentist, Assisted by Bayes, Prediction-Powered Inference | Feb 4, 2025 | Prediction, valid | Unverified | 0 |
| Variance-Adjusted Cosine Distance as Similarity Metric | Feb 4, 2025 | valid | Unverified | 0 |