| AEON: Adaptive Estimation of Instance-Dependent In-Distribution and Out-of-Distribution Label Noise for Robust Learning | Jan 23, 2025 | Benchmarking, image-classification | Unverified | 0 |
| You Only Crash Once v2: Perceptually Consistent Strong Features for One-Stage Domain Adaptive Detection of Space Terrain | Jan 23, 2025 | Benchmarking, Domain Adaptation | Unverified | 0 |
| DI-BENCH: Benchmarking Large Language Models on Dependency Inference with Testable Repositories at Scale | Jan 23, 2025 | Benchmarking | Unverified | 0 |
| Leveraging LLMs to Create a Haptic Devices' Recommendation System | Jan 22, 2025 | Benchmarking | Unverified | 0 |
| CHaRNet: Conditioned Heatmap Regression for Robust Dental Landmark Localization | Jan 22, 2025 | Benchmarking, regression | Unverified | 0 |
| Implicit Causality-biases in humans and LLMs as a tool for benchmarking LLM discourse capabilities | Jan 22, 2025 | Benchmarking, Referring Expression | Unverified | 0 |
| RAG-Reward: Optimizing RAG with Reward Modeling and RLHF | Jan 22, 2025 | Benchmarking, Hallucination | Unverified | 0 |
| Does Table Source Matter? Benchmarking and Improving Multimodal Scientific Table Understanding and Reasoning | Jan 22, 2025 | Benchmarking | Code Available | 0 |
| Benchmarking Generative AI for Scoring Medical Student Interviews in Objective Structured Clinical Examinations (OSCEs) | Jan 21, 2025 | Benchmarking | Unverified | 0 |
| Optimally-Weighted Maximum Mean Discrepancy Framework for Continual Learning | Jan 21, 2025 | Benchmarking, Continual Learning | Unverified | 0 |