| Gemstones: A Model Suite for Multi-Faceted Scaling Laws | Feb 7, 2025 | Experimental Design, Language Modeling | Code Available | 1 |
| Position-aware Automatic Circuit Discovery | Feb 7, 2025 | Language Modeling | Code Available | 1 |
| Division-of-Thoughts: Harnessing Hybrid Language Model Synergy for Efficient On-Device Agents | Feb 6, 2025 | Language Modeling | Code Available | 1 |
| Great Models Think Alike and this Undermines AI Oversight | Feb 6, 2025 | Language Modeling | Code Available | 1 |
| ADIFF: Explaining audio difference using natural language | Feb 6, 2025 | AudioCaps, Audio captioning | Code Available | 1 |
| Robotouille: An Asynchronous Planning Benchmark for LLM Agents | Feb 6, 2025 | Language Modeling | Code Available | 1 |
| Gompertz Linear Units: Leveraging Asymmetry for Enhanced Learning Dynamics | Feb 5, 2025 | Image Classification | Code Available | 1 |
| Do Large Language Model Benchmarks Test Reliability? | Feb 5, 2025 | Language Modeling | Code Available | 1 |
| Enhancing Reasoning to Adapt Large Language Models for Domain-Specific Applications | Feb 5, 2025 | In-Context Learning, Language Modeling | Code Available | 1 |
| Intent Representation Learning with Large Language Model for Recommendation | Feb 5, 2025 | Language Modeling | Code Available | 1 |