| Fence decompositions and cherry covers in non-binary phylogenetic networks | Jul 11, 2024 | ARC | —Unverified | 0 |
| metabench -- A Sparse Benchmark to Measure General Ability in Large Language Models | Jul 4, 2024 | ARC, GSM8K | Code Available | 0 |
| VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation | Jun 25, 2024 | ARC, Benchmarking | Code Available | 0 |
| LLM-ARC: Enhancing LLMs with an Automated Reasoning Critic | Jun 25, 2024 | ARC, Logical Reasoning | —Unverified | 0 |
| PORT: Preference Optimization on Reasoning Traces | Jun 23, 2024 | ARC, GSM8K | —Unverified | 0 |
| Promises, Outlooks and Challenges of Diffusion Language Modeling | Jun 17, 2024 | ARC, HellaSwag | —Unverified | 0 |
| Circular transformation of the European steel industry renders scrap metal a strategic resource | Jun 17, 2024 | ARC | —Unverified | 0 |
| Regularizing Numerical Extremals Along Singular Arcs: A Lie-Theoretic Approach | Jun 11, 2024 | ARC | —Unverified | 0 |
| A One-Layer Decoder-Only Transformer is a Two-Layer RNN: With an Application to Certified Robustness | May 27, 2024 | ARC, Decoder | —Unverified | 0 |
| Adaptive Gradient Clipping for Robust Federated Learning | May 23, 2024 | ARC, Federated Learning | —Unverified | 0 |