| Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders | Oct 27, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Centaur: a foundation model of human cognition | Oct 26, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training | Oct 25, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Scaling up Masked Diffusion Models on Text | Oct 24, 2024 | GSM8KLanguage Modeling | CodeCode Available | 3 |
| Scaling Diffusion Language Models via Adaptation from Autoregressive Models | Oct 23, 2024 | In-Context LearningLanguage Modeling | CodeCode Available | 3 |
| DPLM-2: A Multimodal Diffusion Protein Language Model | Oct 17, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking | Oct 16, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Predicting from Strings: Language Model Embeddings for Bayesian Optimization | Oct 14, 2024 | Bayesian OptimizationExperimental Design | CodeCode Available | 3 |
| Baichuan-Omni Technical Report | Oct 11, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference | Oct 6, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |