| FRED: The Florence RGB-Event Drone Dataset | Jun 5, 2025 | Benchmarking, Trajectory Forecasting | —Unverified | 0 |
| Free Performance Gain from Mixing Multiple Partially Labeled Samples in Multi-label Image Classification | May 24, 2024 | Benchmarking, Data Augmentation | —Unverified | 0 |
| From 2D to 3D: Re-thinking Benchmarking of Monocular Depth Prediction | Mar 15, 2022 | 3D geometry, Benchmarking | —Unverified | 0 |
| From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano | Jul 5, 2024 | Attribute, Benchmarking | —Unverified | 0 |
| From Blind Solvers to Logical Thinkers: Benchmarking LLMs' Logical Integrity on Faulty Mathematical Problems | Oct 24, 2024 | Benchmarking, Common Sense Reasoning | —Unverified | 0 |
| From Code to Play: Benchmarking Program Search for Games Using Large Language Models | Dec 5, 2024 | Atari Games, Benchmarking | —Unverified | 0 |
| From Environmental Sound Representation to Robustness of 2D CNN Models Against Adversarial Attacks | Apr 14, 2022 | Adversarial Attack, Adversarial Robustness | —Unverified | 0 |
| From Generalist to Specialist: Improving Large Language Models for Medical Physics Using ARCoT | May 17, 2024 | Benchmarking, Multiple-choice | —Unverified | 0 |
| From Generation to Detection: A Multimodal Multi-Task Dataset for Benchmarking Health Misinformation | May 24, 2025 | Articles, Benchmarking | —Unverified | 0 |
| From Grounding to Planning: Benchmarking Bottlenecks in Web Agents | Sep 3, 2024 | Benchmarking | —Unverified | 0 |