| Large Language Model Agent: A Survey on Methodology, Applications and Challenges | Mar 27, 2025 | Language Modeling | Code Available | 7 | 5 |
| FastSwitch: Optimizing Context Switching Efficiency in Fairness-aware Large Language Model Serving | Nov 27, 2024 | Fairness; GPU | Code Available | 7 | 5 |
| From Bytes to Ideas: Language Modeling with Autoregressive U-Nets | Jun 17, 2025 | Language Modeling | Code Available | 7 | 5 |
| MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning | Oct 14, 2023 | Image Classification; Image Description | Code Available | 7 | 5 |
| MagicQuill: An Intelligent Interactive Image Editing System | Nov 14, 2024 | Language Modeling | Code Available | 7 | 5 |
| EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty | Jan 26, 2024 | Code Generation; Instruction Following | Code Available | 7 | 5 |
| Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought | Apr 8, 2025 | Language Modeling | Code Available | 7 | 5 |
| Large Concept Models: Language Modeling in a Sentence Representation Space | Dec 11, 2024 | Language Modeling | Code Available | 7 | 5 |
| Simulating 500 million years of evolution with a language model | Dec 31, 2024 | Language Modeling | Code Available | 7 | 5 |
| Scaling Speech-Text Pre-training with Synthetic Interleaved Data | Nov 26, 2024 | Automatic Speech Recognition (ASR) | Code Available | 7 | 5 |
| Dynamic data sampler for cross-language transfer learning in large language models | May 17, 2024 | Language Modeling | Code Available | 7 | 5 |
| Chinese-Vicuna: A Chinese Instruction-following Llama-based Model | Apr 17, 2025 | Code Generation; CPU | Code Available | 7 | 5 |
| Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model | Jun 10, 2025 | Language Modeling | Code Available | 7 | 5 |
| Tulu 3: Pushing Frontiers in Open Language Model Post-Training | Nov 22, 2024 | Language Modeling | Code Available | 7 | 5 |
| Scalable MatMul-free Language Modeling | Jun 4, 2024 | GPU; Language Modeling | Code Available | 7 | 5 |
| SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models | Feb 8, 2024 | Benchmarking; Diversity | Code Available | 7 | 5 |
| Elixir: Train a Large Language Model on a Small GPU Cluster | Dec 10, 2022 | CPU; GPU | Code Available | 7 | 5 |
| Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers | Jan 5, 2023 | In-Context Learning; Language Modeling | Code Available | 7 | 5 |
| aiXcoder-7B: A Lightweight and Effective Large Language Model for Code Processing | Oct 17, 2024 | Attribute; Code Completion | Code Available | 7 | 5 |
| Efficient Memory Management for Large Language Model Serving with PagedAttention | Sep 12, 2023 | Language Modeling | Code Available | 6 | 5 |
| A Watermark for Large Language Models | Jan 24, 2023 | Language Modeling | Code Available | 6 | 5 |
| Gorilla: Large Language Model Connected with Massive APIs | May 24, 2023 | Hallucination; Language Modeling | Code Available | 6 | 5 |
| GLM-130B: An Open Bilingual Pre-trained Model | Oct 5, 2022 | Language Modeling | Code Available | 6 | 5 |
| NEFTune: Noisy Embeddings Improve Instruction Finetuning | Oct 9, 2023 | Language Modeling | Code Available | 6 | 5 |
| AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration | Jun 1, 2023 | Autonomous Driving; Cloud Computing | Code Available | 6 | 5 |
| FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning | Jul 17, 2023 | GPU; Language Modeling | Code Available | 6 | 5 |
| A Survey of Large Language Models | Mar 31, 2023 | Language Modeling | Code Available | 6 | 5 |
| FinGPT: Open-Source Financial Large Language Models | Jun 9, 2023 | Algorithmic Trading; Language Modeling | Code Available | 6 | 5 |
| Mamba: Linear-Time Sequence Modeling with Selective State Spaces | Dec 1, 2023 | 2D Pose Estimation; Common Sense Reasoning | Code Available | 6 | 5 |
| Extending Context Window of Large Language Models via Positional Interpolation | Jun 27, 2023 | Document Summarization; Language Modeling | Code Available | 6 | 5 |
| Chain-of-Thought Prompting Elicits Reasoning in Large Language Models | Jan 28, 2022 | Common Sense Reasoning; GSM8K | Code Available | 6 | 5 |
| ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages | Dec 13, 2022 | Code Summarization; Language Modeling | Code Available | 6 | 5 |
| SGLang: Efficient Execution of Structured Language Model Programs | Dec 12, 2023 | Few-Shot Learning; Language Modeling | Code Available | 6 | 5 |
| Simple and Controllable Music Generation | Jun 8, 2023 | Language Modeling | Code Available | 6 | 5 |
| CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis | Mar 25, 2022 | Code Generation; HumanEval | Code Available | 6 | 5 |
| Direct Preference Optimization: Your Language Model is Secretly a Reward Model | May 29, 2023 | Language Modeling | Code Available | 6 | 5 |
| Qwen Technical Report | Sep 28, 2023 | Language Modeling | Code Available | 6 | 5 |
| CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society | Mar 31, 2023 | Instruction Following; Language Modeling | Code Available | 6 | 5 |
| Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages | Aug 23, 2023 | Image Generation; Image to text | Code Available | 6 | 5 |
| Mistral 7B | Oct 10, 2023 | Answerability Prediction; Arithmetic Reasoning | Code Available | 6 | 5 |
| MobileVLM V2: Faster and Stronger Baseline for Vision Language Model | Feb 6, 2024 | AutoML; Language Modeling | Code Available | 5 | 5 |
| Ovis: Structural Embedding Alignment for Multimodal Large Language Model | May 31, 2024 | Language Modeling; Multimodal Large Language Model | Code Available | 5 | 5 |
| InstructPix2Pix: Learning to Follow Image Editing Instructions | Nov 17, 2022 | Image Editing | Code Available | 5 | 5 |
| Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities | Feb 2, 2024 | Acoustic Scene Classification; Audio Captioning | Code Available | 5 | 5 |
| CogVLM: Visual Expert for Pretrained Language Models | Nov 6, 2023 | 1 Image, 2*2 Stitching; FS-MEVQA | Code Available | 5 | 5 |
| MEIA: Multimodal Embodied Perception and Interaction in Unknown Environments | Feb 1, 2024 | Embodied Question Answering; Language Modeling | Code Available | 5 | 5 |
| HealthGPT: A Medical Large Vision-Language Model for Unifying Comprehension and Generation via Heterogeneous Knowledge Adaptation | Feb 14, 2025 | Language Modeling | Code Available | 5 | 5 |
| NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms | Feb 25, 2025 | Language Modeling | Code Available | 5 | 5 |
| MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs | Feb 23, 2024 | Language Modeling | Code Available | 5 | 5 |
| FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning | Feb 29, 2024 | GPU; Language Modeling | Code Available | 5 | 5 |