| Benchmarking Vision Language Models on German Factual Data | Apr 15, 2025 | Benchmarking | —Unverified | 0 |
| Benchmarking Next-Generation Reasoning-Focused Large Language Models in Ophthalmology: A Head-to-Head Evaluation on 5,888 Items | Apr 15, 2025 | Benchmarking, Multiple-choice | —Unverified | 0 |
| Mamba-Based Ensemble learning for White Blood Cell Classification | Apr 15, 2025 | Benchmarking, Classification | Code Available | 0 |
| GaSLight: Gaussian Splats for Spatially-Varying Lighting in HDR | Apr 15, 2025 | Benchmarking | —Unverified | 0 |
| COUNTS: Benchmarking Object Detectors and Multimodal Large Language Models under Distribution Shifts | Apr 14, 2025 | Benchmarking, Object | —Unverified | 0 |
| CameraBench: Benchmarking Visual Reasoning in MLLMs via Photography | Apr 14, 2025 | Benchmarking, Visual Reasoning | —Unverified | 0 |
| BoTTA: Benchmarking on-device Test Time Adaptation | Apr 14, 2025 | Benchmarking, Test-time Adaptation | —Unverified | 0 |
| Benchmarking 3D Human Pose Estimation Models Under Occlusions | Apr 14, 2025 | 3D Human Pose Estimation, Benchmarking | —Unverified | 0 |
| Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models | Apr 14, 2025 | Benchmarking, Descriptive | —Unverified | 0 |
| Foundation Models for Remote Sensing: An Analysis of MLLMs for Object Localization | Apr 14, 2025 | Benchmarking, Earth Observation | —Unverified | 0 |