| Embodied CoT Distillation From LLM To Off-the-shelf Agents | Dec 16, 2024 | Decision MakingIn-Context Learning | CodeCode Available | 3 |
| BatchTopK Sparse Autoencoders | Dec 9, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| PaliGemma 2: A Family of Versatile VLMs for Transfer | Dec 4, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| From Individual to Society: A Survey on Social Simulation Driven by Large Language Model-based Agents | Dec 4, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Advancing Speech Language Models by Scaling Supervised Fine-Tuning with Over 60,000 Hours of Synthetic Speech Dialogue Data | Dec 2, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| HackSynth: LLM Agent and Evaluation Framework for Autonomous Penetration Testing | Dec 2, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Large Language Model-Brained GUI Agents: A Survey | Nov 27, 2024 | Code GenerationLanguage Modeling | CodeCode Available | 3 |
| On the Efficiency of NLP-Inspired Methods for Tabular Deep Learning | Nov 26, 2024 | Computational EfficiencyDeep Learning | CodeCode Available | 3 |
| Pushing the Limits of Large Language Model Quantization via the Linearity Theorem | Nov 26, 2024 | GPULanguage Modeling | CodeCode Available | 3 |
| BayLing 2: A Multilingual Large Language Model with Efficient Language Alignment | Nov 25, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| SemiKong: Curating, Training, and Evaluating A Semiconductor Industry-Specific Large Language Model | Nov 21, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| The Surprising Effectiveness of Test-Time Training for Few-Shot Learning | Nov 11, 2024 | ARCFew-Shot Learning | CodeCode Available | 3 |
| SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications | Nov 7, 2024 | Code GenerationLanguage Modeling | CodeCode Available | 3 |
| Rule Based Rewards for Language Model Safety | Nov 2, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement | Nov 1, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders | Oct 27, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Centaur: a foundation model of human cognition | Oct 26, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training | Oct 25, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Scaling up Masked Diffusion Models on Text | Oct 24, 2024 | GSM8KLanguage Modeling | CodeCode Available | 3 |
| Scaling Diffusion Language Models via Adaptation from Autoregressive Models | Oct 23, 2024 | In-Context LearningLanguage Modeling | CodeCode Available | 3 |
| DPLM-2: A Multimodal Diffusion Protein Language Model | Oct 17, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking | Oct 16, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| Predicting from Strings: Language Model Embeddings for Bayesian Optimization | Oct 14, 2024 | Bayesian OptimizationExperimental Design | CodeCode Available | 3 |
| Baichuan-Omni Technical Report | Oct 11, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |
| SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference | Oct 6, 2024 | Language ModelingLanguage Modelling | CodeCode Available | 3 |