| Large Language Model Critics for Execution-Free Evaluation of Code Changes | Jan 28, 2025 | Language Modelling | Code Available | 0 |
| Multiple Abstraction Level Retrieve Augment Generation | Jan 28, 2025 | Language Modelling | Unverified | 0 |
| AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders | Jan 28, 2025 | Language Modelling | Code Available | 2 |
| Optimizing Large Language Model Training Using FP4 Quantization | Jan 28, 2025 | Language Modelling | Unverified | 0 |
| Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling | Jan 28, 2025 | Language Modelling | Unverified | 0 |
| Towards Safe AI Clinicians: A Comprehensive Study on Large Language Model Jailbreaking in Healthcare | Jan 27, 2025 | Language Modelling | Unverified | 0 |
| VLMaterial: Procedural Material Generation with Large Vision-Language Models | Jan 27, 2025 | Language Modelling | Unverified | 0 |
| Atla Selene Mini: A General Purpose Evaluation Model | Jan 27, 2025 | Language Modelling | Code Available | 1 |
| BiFold: Bimanual Cloth Folding with Language Guidance | Jan 27, 2025 | Language Modelling | Unverified | 0 |
| Programming by Examples Meets Historical Linguistics: A Large Language Model Based Approach to Sound Law Induction | Jan 27, 2025 | Code Generation, Inductive Bias | Unverified | 0 |