| NeedleBench: Can LLMs Do Retrieval and Reasoning in Information-Dense Context? | Jul 16, 2024 | 4k, 8k | Code Available | 9 |
| YuE: Scaling Open Foundation Models for Long-Form Music Generation | Mar 11, 2025 | Form, In-Context Learning | Code Available | 9 |
| Depth Anything V2 | Jun 13, 2024 | Depth Estimation, Diversity | Code Available | 9 |
| LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning | Mar 26, 2024 | GPU, GSM8K | Code Available | 9 |
| Visually Descriptive Language Model for Vector Graphics Reasoning | Apr 9, 2024 | Descriptive, Language Modeling | Code Available | 9 |
| KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation | Sep 10, 2024 | Knowledge Graphs, Question Answering | Code Available | 9 |
| World Model on Million-Length Video And Language With Blockwise RingAttention | Feb 13, 2024 | 4k, Video Understanding | Code Available | 9 |
| UFO2: The Desktop AgentOS | Apr 20, 2025 | | Code Available | 9 |
| LLM4Decompile: Decompiling Binary Code with Large Language Models | Mar 8, 2024 | HumanEval | Code Available | 9 |
| Do Large Language Models Need a Content Delivery Network? | Sep 16, 2024 | In-Context Learning | Code Available | 9 |
| DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding | Dec 13, 2024 | Chart Understanding, Mixture-of-Experts | Code Available | 9 |
| LatentSync: Audio Conditioned Latent Diffusion Models for Lip Sync | Dec 12, 2024 | Portrait Animation | Code Available | 9 |
| FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models | May 23, 2024 | AI Agent, Decision Making | Code Available | 9 |
| MiniCPM4: Ultra-Efficient LLMs on End Devices | Jun 9, 2025 | Large Language Model | Code Available | 9 |
| Moonshine: Speech Recognition for Live Transcription and Voice Commands | Oct 21, 2024 | Decoder, Position | Code Available | 9 |
| Kodezi Chronos: A Debugging-First Language Model for Repository-Scale, Memory-Driven Code Understanding | Jul 14, 2025 | Code Generation, Language Modeling | Code Available | 9 |
| TripoSR: Fast 3D Object Reconstruction from a Single Image | Mar 4, 2024 | 3D Generation, 3D Object Reconstruction | Code Available | 9 |
| Data-Juicer 2.0: Cloud-Scale Adaptive Data Processing for and with Foundation Models | Dec 23, 2024 | CPU | Code Available | 9 |
| OLMo: Accelerating the Science of Language Models | Feb 1, 2024 | Language Modeling, Language Modelling | Code Available | 9 |
| MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies | Apr 9, 2024 | Domain Adaptation | Code Available | 9 |
| UltraRAG: A Modular and Automated Toolkit for Adaptive Retrieval-Augmented Generation | Mar 31, 2025 | RAG, Retrieval | Code Available | 9 |
| Model Stock: All we need is just a few fine-tuned models | Mar 28, 2024 | All | Code Available | 9 |
| CacheBlend: Fast Large Language Model Serving for RAG with Cached Knowledge Fusion | May 26, 2024 | Language Modeling, Language Modelling | Code Available | 9 |
| Large Action Models: From Inception to Implementation | Dec 13, 2024 | Action Generation | Code Available | 9 |
| A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications | Mar 10, 2025 | Continual Learning, Meta-Learning | Code Available | 9 |
| 2 OLMo 2 Furious | Dec 31, 2024 | | Code Available | 9 |
| LTX-Video: Realtime Video Latent Diffusion | Dec 30, 2024 | Denoising, GPU | Code Available | 9 |
| VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models | Jan 17, 2024 | Text-to-Video Generation, Video Generation | Code Available | 9 |
| s1: Simple test-time scaling | Jan 31, 2025 | Language Modeling, Language Modelling | Code Available | 9 |
| FastVLM: Efficient Vision Encoding for Vision Language Models | Dec 17, 2024 | | Code Available | 9 |
| Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data | Jan 19, 2024 | Data Augmentation, Depth Estimation | Code Available | 9 |
| Arcee's MergeKit: A Toolkit for Merging Large Language Models | Mar 20, 2024 | Language Modeling, Language Modelling | Code Available | 9 |
| SkyServe: Serving AI Models across Regions and Clouds with Spot Instances | Nov 3, 2024 | | Code Available | 9 |
| PP-FormulaNet: Bridging Accuracy and Efficiency in Advanced Formula Recognition | Mar 24, 2025 | | Code Available | 9 |
| When Do We Not Need Larger Vision Models? | Mar 19, 2024 | Depth Estimation | Code Available | 9 |
| garak: A Framework for Security Probing Large Language Models | Jun 16, 2024 | Red Teaming | Code Available | 9 |
| LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression | Mar 19, 2024 | GSM8K, Language Modelling | Code Available | 9 |
| Toward Guidance-Free AR Visual Generation via Condition Contrastive Alignment | Oct 12, 2024 | Language Modelling, Philosophy | Code Available | 9 |
| DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence | Jun 17, 2024 | 16k, Language Modeling | Code Available | 9 |
| SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory | Nov 18, 2024 | Object Tracking, Visual Object Tracking | Code Available | 9 |
| InternLM2 Technical Report | Mar 26, 2024 | 4k, Long-Context Understanding | Code Available | 9 |
| DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception | Oct 16, 2024 | Document Layout Analysis, document understanding | Code Available | 9 |
| PP-DocLayout: A Unified Document Layout Detection Model to Accelerate Large-Scale Data Construction | Mar 21, 2025 | CPU, Document Layout Analysis | Code Available | 9 |
| VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model | Apr 10, 2025 | Language Modeling, Language Modelling | Code Available | 9 |
| UFO: A UI-Focused Agent for Windows OS Interaction | Feb 8, 2024 | Navigate | Code Available | 9 |
| AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation | Mar 26, 2024 | Diversity, Face Reenactment | Code Available | 9 |
| RULER: What's the Real Context Size of Your Long-Context Language Models? | Apr 9, 2024 | Long-Context Understanding | Code Available | 9 |
| MindSearch: Mimicking Human Minds Elicits Deep AI Searcher | Jul 29, 2024 | 2D Semantic Segmentation task 1 (8 classes), graph construction | Code Available | 9 |
| Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation | Jan 27, 2025 | | Code Available | 9 |
| Overview of the Amphion Toolkit (v0.2) | Jan 26, 2025 | text-to-speech, Text to Speech | Code Available | 9 |