| Finance Language Model Evaluation (FLaME) | Jun 18, 2025 | Benchmarking, Language Model Evaluation | Unverified | 0 |
| BMFM-RNA: An Open Framework for Building and Evaluating Transcriptomic Foundation Models | Jun 17, 2025 | Benchmarking, Language Modeling | Code Available | 2 |
| Q2SAR: A Quantum Multiple Kernel Learning Approach for Drug Discovery | Jun 17, 2025 | Benchmarking, Drug Discovery | Unverified | 0 |
| PGLib-CO2: A Power Grid Library for Computing and Optimizing Carbon Emissions | Jun 17, 2025 | Benchmarking | Unverified | 0 |
| GUI-Robust: A Comprehensive Dataset for Testing GUI Agent Robustness in Real-World Anomalies | Jun 17, 2025 | Benchmarking | Code Available | 1 |
| ImpliRet: Benchmarking the Implicit Fact Retrieval Challenge | Jun 17, 2025 | Benchmarking, Retrieval | Code Available | 0 |
| A large-scale heterogeneous 3D magnetic resonance brain imaging dataset for self-supervised learning | Jun 17, 2025 | Benchmarking, Self-Supervised Learning | Unverified | 0 |
| Egocentric Human-Object Interaction Detection: A New Benchmark and Method | Jun 17, 2025 | Benchmarking, Human-Object Interaction Detection | Unverified | 0 |
| Deep Diffusion Models and Unsupervised Hyperspectral Unmixing for Realistic Abundance Map Synthesis | Jun 16, 2025 | Benchmarking, Data Augmentation | Unverified | 0 |
| The Price of Freedom: Exploring Expressivity and Runtime Tradeoffs in Equivariant Tensor Products | Jun 16, 2025 | Benchmarking | Code Available | 1 |