| Sum Rate Maximization for Pinching Antennas Assisted RSMA System With Multiple Waveguides | Jun 12, 2025 | Benchmarking | Unverified | 0 |
| OIBench: Benchmarking Strong Reasoning Models with Olympiad in Informatics | Jun 12, 2025 | Benchmarking | Unverified | 0 |
| Primender Sequence: A Novel Mathematical Construct for Testing Symbolic Inference and AI Reasoning | Jun 12, 2025 | Benchmarking | Unverified | 0 |
| SDialog: A Python Toolkit for Synthetic Dialogue Generation and Analysis | Jun 12, 2025 | Benchmarking, Dialogue Generation | Code Available | 2 |
| Bench to the Future: A Pastcasting Benchmark for Forecasting Agents | Jun 11, 2025 | Benchmarking | Unverified | 0 |
| ICE-ID: A Novel Historical Census Data Benchmark Comparing NARS against LLMs, & a ML Ensemble on Longitudinal Identity Resolution | Jun 11, 2025 | Benchmarking | Unverified | 0 |
| ScholarSearch: Benchmarking Scholar Searching Ability of LLMs | Jun 11, 2025 | Benchmarking, Information Retrieval | Unverified | 0 |
| Reasoning as a Resource: Optimizing Fast and Slow Thinking in Code Generation Models | Jun 11, 2025 | Benchmarking, Code Generation | Unverified | 0 |
| Attention, Please! Revisiting Attentive Probing for Masked Image Modeling | Jun 11, 2025 | Benchmarking, Computational Efficiency | Code Available | 1 |
| GLGENN: A Novel Parameter-Light Equivariant Neural Networks Architecture Based on Clifford Geometric Algebras | Jun 11, 2025 | Benchmarking | Code Available | 1 |