| Gymnasium: A Standard Interface for Reinforcement Learning Environments | Jul 24, 2024 | reinforcement-learning, Reinforcement Learning | Code Available | 11 | 5 |
| MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer | Sep 1, 2024 | Self-Supervised Learning, text-to-speech | Code Available | 9 | 5 |
| FinRobot: AI Agent for Equity Research and Valuation with Large Language Models | Nov 13, 2024 | AI Agent | Code Available | 9 | 5 |
| OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on | Mar 4, 2024 | Denoising, Image Generation | Code Available | 9 | 5 |
| Contextual Augmented Multi-Model Programming (CAMP): A Hybrid Local-Cloud Copilot Framework | Oct 20, 2024 | Code Completion, RAG | Code Available | 9 | 5 |
| StableToolBench: Towards Stable Large-Scale Benchmarking on Tool Learning of Large Language Models | Mar 12, 2024 | Benchmarking | Code Available | 9 | 5 |
| Depth Pro: Sharp Monocular Metric Depth in Less Than a Second | Oct 2, 2024 | Depth Estimation, GPU | Code Available | 9 | 5 |
| Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction | Apr 3, 2024 | Image Generation, Image Reconstruction | Code Available | 9 | 5 |
| HART: Efficient Visual Generation with Hybrid Autoregressive Transformer | Oct 14, 2024 | Image Generation, Image Reconstruction | Code Available | 9 | 5 |
| Sapiens: Foundation for Human Vision Models | Aug 22, 2024 | 2D Human Pose Estimation, 2D Pose Estimation | Code Available | 9 | 5 |
| SkyReels-V2: Infinite-length Film Generative Model | Apr 17, 2025 | Large Language Model, model | Code Available | 9 | 5 |
| DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models | Feb 5, 2024 | Arithmetic Reasoning, Math | Code Available | 9 | 5 |
| DeepSeek LLM: Scaling Open-Source Language Models with Longtermism | Jan 5, 2024 | | Code Available | 9 | 5 |
| SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer | Jan 30, 2025 | Image Generation, Model Compression | Code Available | 9 | 5 |
| DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model | May 7, 2024 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| Language agents achieve superhuman synthesis of scientific knowledge | Sep 10, 2024 | Articles, Information Retrieval | Code Available | 9 | 5 |
| TorchTitan: One-stop PyTorch native solution for production ready LLM pre-training | Oct 9, 2024 | GPU | Code Available | 9 | 5 |
| Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion | Jul 1, 2024 | Decision Making, Prediction | Code Available | 9 | 5 |
| Liger Kernel: Efficient Triton Kernels for LLM Training | Oct 14, 2024 | Chunking, GPU | Code Available | 9 | 5 |
| CogVLM2: Visual Language Models for Image and Video Understanding | Aug 29, 2024 | MM-Vet, MVBench | Code Available | 9 | 5 |
| SuperSimpleNet: Unifying Unsupervised and Supervised Learning for Fast and Reliable Surface Defect Detection | Aug 6, 2024 | Anomaly Detection, Defect Detection | Code Available | 9 | 5 |
| Grounded SAM: Assembling Open-World Models for Diverse Visual Tasks | Jan 25, 2024 | Segmentation | Code Available | 9 | 5 |
| StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation | May 2, 2024 | motion prediction, Story Generation | Code Available | 9 | 5 |
| MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention | Jul 2, 2024 | GPU, Language Modelling | Code Available | 9 | 5 |
| ORPO: Monolithic Preference Optimization without Reference Model | Mar 12, 2024 | model | Code Available | 9 | 5 |
| FlashInfer: Efficient and Customizable Attention Engine for LLM Inference Serving | Jan 2, 2025 | GPU, Scheduling | Code Available | 9 | 5 |
| Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration | Jun 3, 2024 | | Code Available | 9 | 5 |
| Symbolic Learning Enables Self-Evolving Agents | Jun 26, 2024 | | Code Available | 9 | 5 |
| Aviary: training language agents on challenging scientific tasks | Dec 30, 2024 | | Code Available | 9 | 5 |
| Enhancing Investment Analysis: Optimizing AI-Agent Collaboration in Financial Research | Nov 7, 2024 | AI Agent, Decision Making | Code Available | 9 | 5 |
| Metis: A Foundation Speech Generation Model with Masked Generative Pre-training | Feb 5, 2025 | Self-Supervised Learning, Speech Enhancement | Code Available | 9 | 5 |
| Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting | May 20, 2025 | | Code Available | 9 | 5 |
| CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark | Jan 22, 2024 | | Code Available | 9 | 5 |
| YOLO-World: Real-Time Open-Vocabulary Object Detection | Jan 30, 2024 | Instance Segmentation, Language Modeling | Code Available | 9 | 5 |
| Yi: Open Foundation Models by 01.AI | Mar 7, 2024 | Attribute, Chatbot | Code Available | 9 | 5 |
| Steering Language Models with Game-Theoretic Solvers | Jan 24, 2024 | Imitation Learning, Scheduling | Code Available | 9 | 5 |
| VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild | Mar 25, 2024 | Decoder, Language Modeling | Code Available | 9 | 5 |
| (Perhaps) Beyond Human Translation: Harnessing Multi-Agent Collaboration for Translating Ultra-Long Literary Texts | May 20, 2024 | Machine Translation, Translation | Code Available | 9 | 5 |
| LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model | Jun 7, 2024 | Language Modeling, Language Modelling | Code Available | 9 | 5 |
| BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack | Jun 14, 2024 | Question Answering, Retrieval-augmented Generation | Code Available | 9 | 5 |
| NeedleBench: Can LLMs Do Retrieval and Reasoning in Information-Dense Context? | Jul 16, 2024 | 4k, 8k | Code Available | 9 | 5 |
| YuE: Scaling Open Foundation Models for Long-Form Music Generation | Mar 11, 2025 | Form, In-Context Learning | Code Available | 9 | 5 |
| Depth Anything V2 | Jun 13, 2024 | Depth Estimation, Diversity | Code Available | 9 | 5 |
| LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning | Mar 26, 2024 | GPU, GSM8K | Code Available | 9 | 5 |
| Visually Descriptive Language Model for Vector Graphics Reasoning | Apr 9, 2024 | Descriptive, Language Modeling | Code Available | 9 | 5 |
| KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation | Sep 10, 2024 | Knowledge Graphs, Question Answering | Code Available | 9 | 5 |
| World Model on Million-Length Video And Language With Blockwise RingAttention | Feb 13, 2024 | 4k, Video Understanding | Code Available | 9 | 5 |
| UFO2: The Desktop AgentOS | Apr 20, 2025 | | Code Available | 9 | 5 |
| LLM4Decompile: Decompiling Binary Code with Large Language Models | Mar 8, 2024 | HumanEval | Code Available | 9 | 5 |
| Do Large Language Models Need a Content Delivery Network? | Sep 16, 2024 | In-Context Learning | Code Available | 9 | 5 |