| Compact Language Models via Pruning and Knowledge Distillation | Jul 19, 2024 | Knowledge Distillation, Language Modeling | Code Available | 3 | 5 |
| Evaluating Large Language Models Trained on Code | Jul 7, 2021 | Code Generation, HumanEval | Code Available | 3 | 5 |
| Evalverse: Unified and Accessible Library for Large Language Model Evaluation | Apr 1, 2024 | Language Model Evaluation, Language Modeling | Code Available | 3 | 5 |
| Revisiting Pre-Trained Models for Chinese Natural Language Processing | Apr 29, 2020 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs | Aug 24, 2024 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders | Oct 27, 2024 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| Longformer: The Long-Document Transformer | Apr 10, 2020 | Decoder, Language Modeling | Code Available | 3 | 5 |
| Agent Workflow Memory | Sep 11, 2024 | AI Agent, Language Modeling | Code Available | 3 | 5 |
| Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement | Nov 1, 2024 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| A Systematic Evaluation of Large Language Models of Code | Feb 26, 2022 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| Lifelong Learning of Large Language Model based Agents: A Roadmap | Jan 13, 2025 | Incremental Learning, Language Modeling | Code Available | 3 | 5 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Oct 1, 2024 | GPU, Language Modeling | Code Available | 3 | 5 |
| LaViDa: A Large Diffusion Language Model for Multimodal Understanding | May 22, 2025 | Instruction Following, Language Modeling | Code Available | 3 | 5 |
| Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy | Feb 7, 2025 | 4k, General Knowledge | Code Available | 3 | 5 |
| MotionGPT: Human Motion as a Foreign Language | Jun 26, 2023 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training | Oct 25, 2024 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| A Survey on the Optimization of Large Language Model-based Agents | Mar 16, 2025 | Decision Making, Language Modeling | Code Available | 3 | 5 |
| AsymLoRA: Harmonizing Data Conflicts and Commonalities in MLLMs | Feb 27, 2025 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference | Mar 21, 2024 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| Large Language Model based Long-tail Query Rewriting in Taobao Search | Nov 7, 2023 | Contrastive Learning, Language Modeling | Code Available | 3 | 5 |
| A Review of Prominent Paradigms for LLM-Based Agents: Tool Use (Including RAG), Planning, and Feedback Learning | Jun 9, 2024 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| A Survey on the Memory Mechanism of Large Language Model based Agents | Apr 21, 2024 | Language Modeling, Language Modelling | Code Available | 3 | 5 |
| Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model | Aug 30, 2024 | Audio Compression, Audio Generation | Code Available | 3 | 5 |
| Language Models are Few-Shot Learners | May 28, 2020 | answerability prediction, Articles | Code Available | 3 | 5 |
| Cleaner Pretraining Corpus Curation with Neural Web Scraping | Feb 22, 2024 | Language Modeling, Language Modelling | Code Available | 3 | 5 |