| LoQT: Low-Rank Adapters for Quantized Pretraining | May 26, 2024 | GPULanguage Modeling | CodeCode Available | 2 |
| AdaFisher: Adaptive Second Order Optimization via Fisher Information | May 26, 2024 | Computational Efficiencyimage-classification | CodeCode Available | 2 |
| MoEUT: Mixture-of-Experts Universal Transformers | May 25, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| LM4LV: A Frozen Large Language Model for Low-level Vision Tasks | May 24, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| Sparse maximal update parameterization: A holistic approach to sparse training dynamics | May 24, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| Composed Image Retrieval for Remote Sensing | May 24, 2024 | Composed Image Retrieval (CoIR)Descriptive | CodeCode Available | 2 |
| Large language models can be zero-shot anomaly detectors for time series? | May 23, 2024 | Anomaly DetectionLanguage Modeling | CodeCode Available | 2 |
| Not All Language Model Features Are Linear | May 23, 2024 | AllLanguage Modeling | CodeCode Available | 2 |
| Extracting Prompts by Inverting LLM Outputs | May 23, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 2 |
| Vikhr: Constructing a State-of-the-art Bilingual Open-Source Instruction-Following Large Language Model for Russian | May 22, 2024 | Instruction FollowingLanguage Modeling | CodeCode Available | 2 |