| 8-bit Optimizers via Block-wise Quantization | Oct 6, 2021 | Language Modeling, Language Modelling | Code Available | 3 |
| SparseVLM: Visual Token Sparsification for Efficient Vision-Language Model Inference | Oct 6, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs | Aug 24, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| LLaVA-Phi: Efficient Multi-Modal Assistant with Small Language Model | Jan 4, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Compact Language Models via Pruning and Knowledge Distillation | Jul 19, 2024 | Knowledge Distillation, Language Modeling | Code Available | 3 |
| GLM: General Language Model Pretraining with Autoregressive Blank Infilling | Mar 18, 2021 | Abstractive Text Summarization, Classification | Code Available | 3 |
| Lifelong Learning of Large Language Model based Agents: A Roadmap | Jan 13, 2025 | Incremental Learning, Language Modeling | Code Available | 3 |
| Lingma SWE-GPT: An Open Development-Process-Centric Language Model for Automated Software Improvement | Nov 1, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Llemma: An Open Language Model For Mathematics | Oct 16, 2023 | Arithmetic Reasoning, Automated Theorem Proving | Code Available | 3 |
| Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model | Aug 30, 2024 | Audio Compression, Audio Generation | Code Available | 3 |
| SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression | Mar 16, 2025 | Language Modeling, Language Modelling | Code Available | 3 |
| Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference | Mar 21, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Oct 1, 2024 | GPU, Language Modeling | Code Available | 3 |
| Large Language Model-Brained GUI Agents: A Survey | Nov 27, 2024 | Code Generation, Language Modeling | Code Available | 3 |
| LaViDa: A Large Diffusion Language Model for Multimodal Understanding | May 22, 2025 | Instruction Following, Language Modeling | Code Available | 3 |
| COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training | Oct 25, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| 1.5-Pints Technical Report: Pretraining in Days, Not Months -- Your Language Model Thrives on Quality Data | Aug 7, 2024 | 16k, 2k | Code Available | 3 |
| Cleaner Pretraining Corpus Curation with Neural Web Scraping | Feb 22, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities | Jan 23, 2025 | General Knowledge, Instruction Following | Code Available | 3 |
| A Comprehensive Survey on Long Context Language Modeling | Mar 20, 2025 | Language Modeling, Language Modelling | Code Available | 3 |
| Large Language Model based Long-tail Query Rewriting in Taobao Search | Nov 7, 2023 | Contrastive Learning, Language Modeling | Code Available | 3 |
| Language Models are Few-Shot Learners | May 28, 2020 | answerability prediction, Articles | Code Available | 3 |
| Language Model Inversion | Nov 22, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| Data Filtering Networks | Sep 29, 2023 | Language Modeling, Language Modelling | Code Available | 3 |
| On the Efficiency of NLP-Inspired Methods for Tabular Deep Learning | Nov 26, 2024 | Computational Efficiency, Deep Learning | Code Available | 3 |
| LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction | Jul 16, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| LaneGraph2Seq: Lane Topology Extraction with Language Model via Vertex-Edge Encoding and Connectivity Enhancement | Jan 31, 2024 | Autonomous Driving, Language Modeling | Code Available | 2 |
| KV Shifting Attention Enhances Language Modeling | Nov 29, 2024 | In-Context Learning, Language Modeling | Code Available | 2 |
| Knowledge Representation Learning: A Quantitative Review | Dec 28, 2018 | General Classification, Information Retrieval | Code Available | 2 |
| ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation | Jan 11, 2025 | Chart Understanding, Code Generation | Code Available | 2 |
| Chat-3D: Data-efficiently Tuning Large Language Model for Universal Dialogue of 3D Scenes | Aug 17, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| KoSBi: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Application | May 28, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| KG-FIT: Knowledge Graph Fine-Tuning Upon Open-World Knowledge | May 26, 2024 | Graph Embedding, Informativeness | Code Available | 2 |
| Keeping Yourself is Important in Downstream Tuning Multimodal Large Language Model | Mar 6, 2025 | General Knowledge, Image Captioning | Code Available | 2 |
| KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion | Feb 4, 2024 | In-Context Learning, Knowledge Graph Completion | Code Available | 2 |
| Characterization of Large Language Model Development in the Datacenter | Mar 12, 2024 | GPU, Language Modeling | Code Available | 2 |
| VLKEB: A Large Vision-Language Model Knowledge Editing Benchmark | Mar 12, 2024 | knowledge editing, Language Modeling | Code Available | 2 |
| Knowledge Circuits in Pretrained Transformers | May 28, 2024 | In-Context Learning, knowledge editing | Code Available | 2 |
| Jailbreaking Attack against Multimodal Large Language Model | Feb 4, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Just read twice: closing the recall gap for recurrent language models | Jul 7, 2024 | In-Context Learning, Language Modeling | Code Available | 2 |
| ChartAssisstant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning | Jan 4, 2024 | Data Visualization, Decision Making | Code Available | 2 |
| ChatterBox: Multi-round Multimodal Referring and Grounding | Jan 24, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Kani: A Lightweight and Highly Hackable Framework for Building Language Model Applications | Sep 11, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| Language Model Cascades | Jul 21, 2022 | Few-Shot Learning, Language Modeling | Code Available | 2 |
| Cedille: A large autoregressive French language model | Feb 7, 2022 | Few-Shot Learning, Language Modeling | Code Available | 2 |
| Introducing Visual Perception Token into Multimodal Large Language Model | Feb 24, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model | May 18, 2023 | Image Generation, Language Modeling | Code Available | 2 |
| Inference-Time Intervention: Eliciting Truthful Answers from a Language Model | Jun 6, 2023 | Language Modeling, Language Modelling | Code Available | 2 |
| InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents | Mar 5, 2024 | Benchmarking, Language Modeling | Code Available | 2 |
| Large Language Model Instruction Following: A Survey of Progresses and Challenges | Mar 18, 2023 | Instruction Following, Language Modeling | Code Available | 2 |