| Lifelong Learning of Large Language Model based Agents: A Roadmap | Jan 13, 2025 | Incremental Learning, Language Modeling | Code Available | 3 |
| LLaVA-Phi: Efficient Multi-Modal Assistant with Small Language Model | Jan 4, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference | Mar 21, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training | Oct 25, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model | Aug 30, 2024 | Audio Compression, Audio Generation | Code Available | 3 |
| GLM: General Language Model Pretraining with Autoregressive Blank Infilling | Mar 18, 2021 | Abstractive Text Summarization, Classification | Code Available | 3 |
| 8-bit Optimizers via Block-wise Quantization | Oct 6, 2021 | Language Modeling, Language Modelling | Code Available | 3 |
| Large Language Model-Brained GUI Agents: A Survey | Nov 27, 2024 | Code Generation, Language Modeling | Code Available | 3 |
| ContextCite: Attributing Model Generation to Context | Sep 1, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference | Oct 6, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| LaViDa: A Large Diffusion Language Model for Multimodal Understanding | May 22, 2025 | Instruction Following, Language Modeling | Code Available | 3 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Oct 1, 2024 | GPU, Language Modeling | Code Available | 3 |
| Llemma: An Open Language Model For Mathematics | Oct 16, 2023 | Arithmetic Reasoning, Automated Theorem Proving | Code Available | 3 |
| Language Models are Few-Shot Learners | May 28, 2020 | answerability prediction, Articles | Code Available | 3 |
| Cleaner Pretraining Corpus Curation with Neural Web Scraping | Feb 22, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Advancing Speech Language Models by Scaling Supervised Fine-Tuning with Over 60,000 Hours of Synthetic Speech Dialogue Data | Dec 2, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| 1.5-Pints Technical Report: Pretraining in Days, Not Months -- Your Language Model Thrives on Quality Data | Aug 7, 2024 | 16k, 2k | Code Available | 3 |
| Language Model Inversion | Nov 22, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| Agent Workflow Memory | Sep 11, 2024 | AI Agent, Language Modeling | Code Available | 3 |
| SWEET-RL: Training Multi-Turn LLM Agents on Collaborative Reasoning Tasks | Mar 19, 2025 | Language Modeling, Language Modelling | Code Available | 3 |
| LangFair: A Python Package for Assessing Bias and Fairness in Large Language Model Use Cases | Jan 6, 2025 | Fairness, Language Modeling | Code Available | 3 |
| A Comprehensive Survey on Long Context Language Modeling | Mar 20, 2025 | Language Modeling, Language Modelling | Code Available | 3 |
| Knowledge Graphs Meet Multi-Modal Learning: A Comprehensive Survey | Feb 8, 2024 | Articles, Entity Alignment | Code Available | 3 |
| Language Model Council: Democratically Benchmarking Foundation Models on Highly Subjective Tasks | Jun 12, 2024 | Benchmarking, Chatbot | Code Available | 3 |
| Large Language Model based Long-tail Query Rewriting in Taobao Search | Nov 7, 2023 | Contrastive Learning, Language Modeling | Code Available | 3 |