| Inference on the value of a linear program | Jun 7, 2025 | valid | Unverified | 0 |
| On Efficient Estimation of Distributional Treatment Effects under Covariate-Adaptive Randomization | Jun 6, 2025 | regression, valid | Code Available | 0 |
| Speech Neurophysiology in Realistic Contexts: Big Hype or Big Leap? | Jun 5, 2025 | valid | Unverified | 0 |
| Does It Make Sense to Speak of Introspection in Large Language Models? | Jun 5, 2025 | valid | Unverified | 0 |
| SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL | Jun 4, 2025 | Text to SQL, Text-To-SQL | Unverified | 0 |
| DrSR: LLM based Scientific Equation Discovery with Dual Reasoning from Data and Experience | Jun 4, 2025 | Efficient Exploration, Equation Discovery | Unverified | 0 |
| DRE: An Effective Dual-Refined Method for Integrating Small and Large Language Models in Open-Domain Dialogue Evaluation | Jun 4, 2025 | Dialogue Evaluation, valid | Unverified | 0 |
| Pi-SQL: Enhancing Text-to-SQL with Fine-Grained Guidance from Pivot Programming Languages | Jun 1, 2025 | Text to SQL, Text-To-SQL | Unverified | 0 |
| Quantization-based Bounds on the Wasserstein Metric | Jun 1, 2025 | Computational Efficiency, Domain Adaptation | Unverified | 0 |
| Behavioral Augmentation of UML Class Diagrams: An Empirical Study of Large Language Models for Method Generation | Jun 1, 2025 | Model Selection, Prompt Engineering | Code Available | 0 |
| Clinical Annotations for Automatic Stuttering Severity Assessment | May 31, 2025 | valid | Code Available | 0 |
| CSVQA: A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs | May 30, 2025 | Diagnostic, Image Comprehension | Unverified | 0 |
| Stable Thompson Sampling: Valid Inference via Variance Inflation | May 29, 2025 | Decision Making, Thompson Sampling | Unverified | 0 |
| Conformal Object Detection by Sequential Risk Control | May 29, 2025 | Conformal Prediction, Object | Unverified | 0 |
| Generalizability vs. Counterfactual Explainability Trade-Off | May 29, 2025 | counterfactual, valid | Unverified | 0 |
| Maximum Likelihood Learning of Latent Dynamics Without Reconstruction | May 29, 2025 | Scheduling, valid | Unverified | 0 |
| What Has Been Lost with Synthetic Evaluation? | May 28, 2025 | Negation, Reading Comprehension | Unverified | 0 |
| Automatic Transmission for LLM Tiers: Optimizing Cost and Accuracy in Large Language Models | May 27, 2025 | valid | Code Available | 0 |
| STACI: Spatio-Temporal Aleatoric Conformal Inference | May 27, 2025 | Gaussian Processes, GPU | Unverified | 0 |
| PrivATE: Differentially Private Confidence Intervals for Average Treatment Effects | May 27, 2025 | Privacy Preserving, Uncertainty Quantification | Unverified | 0 |
| On the Robustness of RSMA to Adversarial BD-RIS-Induced Interference | May 26, 2025 | valid | Unverified | 0 |
| We Need to Measure Data Diversity in NLP -- Better and Broader | May 26, 2025 | Diversity, valid | Unverified | 0 |
| Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach | May 26, 2025 | TAR, valid | Unverified | 0 |
| PAMD: Plausibility-Aware Motion Diffusion Model for Long Dance Generation | May 26, 2025 | valid | Unverified | 0 |
| Collision- and Reachability-Aware Multi-Robot Control with Grounded LLM Planners | May 26, 2025 | MuJoCo, valid | Unverified | 0 |