| GEMMAS: Graph-based Evaluation Metrics for Multi Agent Systems | Jul 17, 2025 | Diversity, GSM8K | Unverified | 0 |
| DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression | Jul 16, 2025 | GSM8K | Code Available | 0 |
| KisMATH: Do LLMs Have Knowledge of Implicit Structures in Mathematical Reasoning? | Jul 15, 2025 | GSM8K, Language Modeling | Unverified | 0 |
| CoRE: Enhancing Metacognition with Label-free Self-evaluation in LRMs | Jul 8, 2025 | GSM8K, Math | Unverified | 0 |
| Activation Steering for Chain-of-Thought Compression | Jul 7, 2025 | GSM8K, Math | Code Available | 0 |
| any4: Learned 4-bit Numeric Representation for LLMs | Jul 7, 2025 | GPU, GSM8K | Code Available | 2 |
| IRanker: Towards Ranking Foundation Model | Jun 25, 2025 | GSM8K, model | Code Available | 1 |
| Scaling Speculative Decoding with Lookahead Reasoning | Jun 24, 2025 | GPU, GSM8K | Code Available | 0 |
| Plan for Speed -- Dilated Scheduling for Masked Diffusion Language Models | Jun 23, 2025 | Code Completion, GSM8K | Unverified | 0 |
| CommVQ: Commutative Vector Quantization for KV Cache Compression | Jun 23, 2025 | GPU, GSM8K | Code Available | 1 |