| LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression | Mar 19, 2024 | GSM8K, Language Modelling | Code Available | 9 |
| StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models | Mar 12, 2024 | Benchmarking | Code Available | 9 |
| ORPO: Monolithic Preference Optimization without Reference Model | Mar 12, 2024 | model | Code Available | 9 |
| LLM4Decompile: Decompiling Binary Code with Large Language Models | Mar 8, 2024 | HumanEval | Code Available | 9 |
| Divide and Conquer: High-Resolution Industrial Anomaly Detection via Memory Efficient Tiled Ensemble | Mar 7, 2024 | Anomaly Detection, GPU | Code Available | 9 |
| Yi: Open Foundation Models by 01.AI | Mar 7, 2024 | Attribute, Chatbot | Code Available | 9 |
| OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on | Mar 4, 2024 | Denoising, Image Generation | Code Available | 9 |
| TripoSR: Fast 3D Object Reconstruction from a Single Image | Mar 4, 2024 | 3D Generation, 3D Object Reconstruction | Code Available | 9 |
| World Model on Million-Length Video And Language With Blockwise RingAttention | Feb 13, 2024 | 4k, Video Understanding | Code Available | 9 |
| UFO: A UI-Focused Agent for Windows OS Interaction | Feb 8, 2024 | Navigate | Code Available | 9 |
| DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models | Feb 5, 2024 | Arithmetic Reasoning, Math | Code Available | 9 |
| Natural language guidance of high-fidelity text-to-speech with synthetic annotations | Feb 2, 2024 | In-Context Learning, Language Modeling | Code Available | 9 |
| OLMo: Accelerating the Science of Language Models | Feb 1, 2024 | Language Modeling, Language Modelling | Code Available | 9 |
| YOLO-World: Real-Time Open-Vocabulary Object Detection | Jan 30, 2024 | Instance Segmentation, Language Modeling | Code Available | 9 |
| Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks | Jan 25, 2024 | Segmentation | Code Available | 9 |
| Steering Language Models with Game-Theoretic Solvers | Jan 24, 2024 | Imitation Learning, Scheduling | Code Available | 9 |
| CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark | Jan 22, 2024 | | Code Available | 9 |
| Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data | Jan 19, 2024 | Data Augmentation, Depth Estimation | Code Available | 9 |
| VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models | Jan 17, 2024 | Text-to-Video Generation, Video Generation | Code Available | 9 |
| DeepSeek LLM: Scaling Open-Source Language Models with Longtermism | Jan 5, 2024 | | Code Available | 9 |
| Perception Encoder: The best visual embeddings are not at the output of the network | Apr 17, 2025 | Depth Estimation, Language Modeling | Code Available | 8 |
| GPT4All: An Ecosystem of Open Source Compressed Language Models | Nov 6, 2023 | | Code Available | 8 |
| Llama 2: Open Foundation and Fine-Tuned Chat Models | Jul 18, 2023 | Arithmetic Reasoning | Code Available | 8 |
| Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition | Jul 17, 2023 | Decoder, Language Modeling | Code Available | 8 |
| DETRs Beat YOLOs on Real-time Object Detection | Apr 17, 2023 | 2D Object Detection, Decoder | Code Available | 8 |
| Robust Speech Recognition via Large-Scale Weak Supervision | Dec 6, 2022 | Robust Speech Recognition, speech-recognition | Code Available | 8 |
| Fine-mixing: Mitigating Backdoors in Fine-tuned Language Models | Oct 18, 2022 | Language Modelling, Sentence | Code Available | 8 |
| DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis | Jun 2, 2022 | Document Layout Analysis, Object Detection | Code Available | 8 |
| Attention Residuals | Mar 16, 2026 | | Unverified | 7 |
| WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning | Mar 12, 2026 | | Unverified | 7 |
| Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem | Mar 12, 2026 | | Unverified | 7 |
| Pretraining Large Language Models with NVFP4 | Mar 4, 2026 | | Unverified | 7 |
| dLLM: Simple Diffusion Language Modeling | Feb 26, 2026 | | Unverified | 7 |
| GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning | Feb 26, 2026 | | Unverified | 7 |
| SAM 3D Body: Robust Full-Body Human Mesh Recovery | Feb 17, 2026 | | Unverified | 7 |
| Qwen3-ASR Technical Report | Jan 30, 2026 | | Unverified | 7 |
| Advancing Open-source World Models | Jan 28, 2026 | | Unverified | 7 |
| Is Diversity All You Need for Scalable Robotic Manipulation? | Jul 8, 2025 | All, Diversity | Code Available | 7 |
| Skywork-R1V3 Technical Report | Jul 8, 2025 | cross-modal alignment, Mathematical Reasoning | Code Available | 7 |
| EvoAgentX: An Automated Framework for Evolving Agentic Workflows | Jul 4, 2025 | Code Generation, Math | Code Available | 7 |
| GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning | Jul 1, 2025 | document understanding, Multimodal Reasoning | Code Available | 7 |
| OmniGen2: Exploration to Advanced Multimodal Generation | Jun 23, 2025 | Image Generation, multimodal generation | Code Available | 7 |
| From Bytes to Ideas: Language Modeling with Autoregressive U-Nets | Jun 17, 2025 | Language Modeling, Language Modelling | Code Available | 7 |
| MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention | Jun 16, 2025 | Mixture-of-Experts, Reinforcement Learning (RL) | Code Available | 7 |
| AgentOrchestra: A Hierarchical Multi-Agent Framework for General-Purpose Task Solving | Jun 14, 2025 | | Code Available | 7 |
| ComfyUI-R1: Exploring Reasoning Models for Workflow Generation | Jun 11, 2025 | 4k | Code Available | 7 |
| V-JEPA 2: Self-Supervised Video Models Enable Understanding, Prediction and Planning | Jun 11, 2025 | Action Anticipation, Large Language Model | Code Available | 7 |
| Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model | Jun 10, 2025 | Language Modeling, Language Modelling | Code Available | 7 |
| Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library | Jun 6, 2025 | Management | Code Available | 7 |
| MMSU: A Massive Multi-task Spoken Language Understanding and Reasoning Benchmark | Jun 5, 2025 | Rhythm, Spoken Language Understanding | Code Available | 7 |