| Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models | Mar 12, 2025 | Denoising, Language Modeling | Code Available | 4 |
| R1-Onevision: An Open-Source Multimodal Large Language Model Capable of Deep Reasoning | Feb 24, 2025 | Language Modeling, Language Modelling | Code Available | 4 |
| ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates | Feb 10, 2025 | Hierarchical Reinforcement Learning, Language Modeling | Code Available | 4 |
| Steel-LLM: From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM | Feb 10, 2025 | Language Modeling, Language Modelling | Code Available | 4 |
| Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach | Feb 7, 2025 | Language Modeling, Language Modelling | Code Available | 4 |
| LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models | Jan 31, 2025 | Caption Generation, Language Modeling | Code Available | 4 |
| Beyond Reward Hacking: Causal Rewards for Large Language Model Alignment | Jan 16, 2025 | Causal Inference, counterfactual | Code Available | 4 |
| Tarsier2: Advancing Large Vision-Language Models from Detailed Video Description to Comprehensive Video Understanding | Jan 14, 2025 | Embodied Question Answering, Hallucination | Code Available | 4 |
| Training Software Engineering Agents and Verifiers with SWE-Gym | Dec 30, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| LLM4AD: A Platform for Algorithm Design with Large Language Model | Dec 23, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator | Dec 16, 2024 | GSM8K, Language Modeling | Code Available | 4 |
| Gated Delta Networks: Improving Mamba2 with Delta Rule | Dec 9, 2024 | Common Sense Reasoning, Language Modeling | Code Available | 4 |
| Liquid: Language Models are Scalable Multi-modal Generators | Dec 5, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| LLM2CLIP: Powerful Language Model Unlocks Richer Visual Representation | Nov 7, 2024 | Contrastive Learning, Image Captioning | Code Available | 4 |
| MutaPLM: Protein Language Modeling for Mutation Explanation and Engineering | Oct 30, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| SNAC: Multi-Scale Neural Audio Codec | Oct 18, 2024 | Audio Compression, Audio Generation | Code Available | 4 |
| Choices are More Important than Efforts: LLM Enables Efficient Multi-Agent Exploration | Oct 3, 2024 | Diversity, Language Modeling | Code Available | 4 |
| Data-Prep-Kit: getting your data ready for LLM application development | Sep 26, 2024 | CPU, Language Modeling | Code Available | 4 |
| Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding | Sep 22, 2024 | Anomaly Detection, GPU | Code Available | 4 |
| Large Language Model-Based Agents for Software Engineering: A Survey | Sep 4, 2024 | AI Agent, Language Modeling | Code Available | 4 |
| OLMoE: Open Mixture-of-Experts Language Models | Sep 3, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search | Aug 15, 2024 | Automated Theorem Proving, Language Modeling | Code Available | 4 |
| Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation | Aug 8, 2024 | Chunking, Fact Checking | Code Available | 4 |
| The Llama 3 Herd of Models | Jul 31, 2024 | answerability prediction, Language Modeling | Code Available | 4 |
| When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments | Jul 15, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| SEED-Story: Multimodal Long Story Generation with Large Language Model | Jul 11, 2024 | Image Generation, Language Modeling | Code Available | 4 |
| YuLan: An Open-source Large Language Model | Jun 28, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| RaTEScore: A Metric for Radiology Report Generation | Jun 24, 2024 | Diagnostic, Entity Embeddings | Code Available | 4 |
| Long Context Transfer from Language to Vision | Jun 24, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs | Jun 14, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Simple and Effective Masked Diffusion Language Models | Jun 11, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling | Jun 11, 2024 | 4k, Language Modeling | Code Available | 4 |
| AgentGym: Evolving Large Language Model-based Agents across Diverse Environments | Jun 6, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models | Jun 3, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series | May 29, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| AutoCoder: Enhancing Code Large Language Model with AIEV-Instruct | May 23, 2024 | Class-level Code Generation, Code Completion | Code Available | 4 |
| LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit | May 9, 2024 | Benchmarking, Computational Efficiency | Code Available | 4 |
| SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing | May 7, 2024 | Image Manipulation, Language Modeling | Code Available | 4 |
| Self-Play Preference Optimization for Language Model Alignment | May 1, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models | Apr 19, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Photo-Realistic Image Restoration in the Wild with Controlled Vision-Language Models | Apr 15, 2024 | Image Generation, Image Restoration | Code Available | 4 |
| Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention | Apr 10, 2024 | Book summarization, Language Modeling | Code Available | 4 |
| AutoWebGLM: A Large Language Model-based Web Navigating Agent | Apr 4, 2024 | Decision Making, Language Modeling | Code Available | 4 |
| MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens | Apr 4, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Sailor: Open Language Models for South-East Asia | Apr 4, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| A Survey on Large Language Model-Based Game Agents | Apr 2, 2024 | Decision Making, Language Modeling | Code Available | 4 |
| BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text | Mar 27, 2024 | Articles, Language Modeling | Code Available | 4 |
| RewardBench: Evaluating Reward Models for Language Modeling | Mar 20, 2024 | Instruction Following, Language Modeling | Code Available | 4 |
| UniTable: Towards a Unified Framework for Table Recognition via Self-Supervised Pretraining | Mar 7, 2024 | Language Modeling, Language Modelling | Code Available | 4 |
| Tower: An Open Multilingual Large Language Model for Translation-Related Tasks | Feb 27, 2024 | Language Modeling, Language Modelling | Code Available | 4 |