| FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision | Jul 11, 2024 | GPU, Quantization | Code Available | 11 | 5 |
| WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model | Nov 26, 2024 | | Code Available | 11 | 5 |
| Demonstration of DB-GPT: Next Generation Data Interaction System Empowered by Large Language Models | Apr 16, 2024 | Data Interaction, Text to SQL | Code Available | 11 | 5 |
| ROMAS: A Role-Based Multi-Agent System for Database monitoring and Planning | Dec 18, 2024 | | Code Available | 11 | 5 |
| Agent S: An Open Agentic Framework that Uses Computers Like a Human | Oct 10, 2024 | AI Agent, Task Planning | Code Available | 11 | 5 |
| The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery | Aug 12, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| WebLLM: A High-Performance In-Browser LLM Inference Engine | Dec 20, 2024 | CPU, GPU | Code Available | 11 | 5 |
| Deep Time Series Models: A Comprehensive Survey and Benchmark | Jul 18, 2024 | Survey, Time Series | Code Available | 11 | 5 |
| Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling | Jan 29, 2025 | Image Generation | Code Available | 11 | 5 |
| LivePortrait: Efficient Portrait Animation with Stitching and Retargeting Control | Jul 3, 2024 | Computational Efficiency, Face Reenactment | Code Available | 11 | 5 |
| OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation | May 29, 2025 | Large Language Model | Code Available | 11 | 5 |
| Wan: Open and Advanced Large-Scale Video Generative Models | Mar 26, 2025 | Video Editing, Video Generation | Code Available | 11 | 5 |
| Introduction to Reinforcement Learning | Aug 13, 2024 | reinforcement-learning, Reinforcement Learning | Code Available | 11 | 5 |
| Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation | Jan 21, 2025 | Texture Synthesis | Code Available | 11 | 5 |
| SCORE: Systematic COnsistency and Robustness Evaluation for Large Language Models | Feb 28, 2025 | MMLU | Code Available | 11 | 5 |
| Attentive Reasoning Queries: A Systematic Method for Optimizing Instruction-Following in Large Language Models | Mar 5, 2025 | Hallucination, Instruction Following | Code Available | 11 | 5 |
| CosyVoice 2: Scalable Streaming Speech Synthesis with Large Language Models | Dec 13, 2024 | In-Context Learning, Quantization | Code Available | 11 | 5 |
| FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs | Jul 4, 2024 | Emotion Recognition, Event Detection | Code Available | 11 | 5 |
| Data Formulator 2: Iterative Creation of Data Visualizations, with AI Transforming Data Along the Way | Aug 28, 2024 | Code Generation, Navigate | Code Available | 11 | 5 |
| WebDancer: Towards Autonomous Information Seeking Agency | May 28, 2025 | | Code Available | 11 | 5 |
| YOLOE: Real-Time Seeing Anything | Mar 10, 2025 | 10-shot image generation | Code Available | 11 | 5 |
| VGGT: Visual Geometry Grounded Transformer | Mar 14, 2025 | Depth Estimation, Novel View Synthesis | Code Available | 11 | 5 |
| SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics | Jun 2, 2025 | Action Generation, GPU | Code Available | 11 | 5 |
| Bridging The Gap between Low-rank and Orthogonal Adaptation via Householder Reflection Adaptation | May 24, 2024 | | Code Available | 11 | 5 |
| Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality | May 31, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| Packing Input Frame Context in Next-Frame Prediction Models for Video Generation | Apr 17, 2025 | | Code Available | 11 | 5 |
| Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens | Mar 3, 2025 | Attribute, text-to-speech | Code Available | 11 | 5 |
| olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models | Feb 25, 2025 | Diversity, Language Modeling | Code Available | 11 | 5 |
| Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents | Apr 1, 2025 | AI Agent, Task Planning | Code Available | 11 | 5 |
| TinyLlama: An Open-Source Small Language Model | Jan 4, 2024 | Computational Efficiency, Language Modeling | Code Available | 11 | 5 |
| Open-Sora Plan: Open-Source Large Video Generation Model | Nov 28, 2024 | Video Generation | Code Available | 11 | 5 |
| YOLOv10: Real-Time End-to-End Object Detection | May 23, 2024 | 2D Object Detection, Data Augmentation | Code Available | 11 | 5 |
| Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis | Nov 29, 2024 | Disentanglement, Motion Generation | Code Available | 11 | 5 |
| Very Large-Scale Multi-Agent Simulation in AgentScope | Jul 25, 2024 | | Code Available | 11 | 5 |
| CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens | Jul 7, 2024 | Language Modelling, Large Language Model | Code Available | 11 | 5 |
| JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation | Nov 12, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| WebSailor: Navigating Super-human Reasoning for Web Agent | Jul 3, 2025 | | Code Available | 11 | 5 |
| Magika: AI-Powered Content-Type Detection | Sep 18, 2024 | CPU, Malware Analysis | Code Available | 11 | 5 |
| IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System | Feb 8, 2025 | Decoder, Language Modeling | Code Available | 11 | 5 |
| Scaling Synthetic Data Creation with 1,000,000,000 Personas | Jun 28, 2024 | Language Modeling, Language Modelling | Code Available | 11 | 5 |
| AutoDev: Automated AI-Driven Development | Mar 13, 2024 | Code Generation, HumanEval | Code Available | 11 | 5 |
| SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering | May 6, 2024 | Bug fixing, Language Modeling | Code Available | 11 | 5 |
| HybridFlow: A Flexible and Efficient RLHF Framework | Sep 28, 2024 | Large Language Model | Code Available | 11 | 5 |
| DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence | Jan 25, 2024 | Code Generation, Language Modeling | Code Available | 11 | 5 |
| Qwen2.5-Coder Technical Report | Sep 18, 2024 | Code Generation | Code Available | 11 | 5 |
| EAP4EMSIG -- Experiment Automation Pipeline for Event-Driven Microscopy to Smart Microfluidic Single-Cells Analysis | Nov 6, 2024 | | Code Available | 11 | 5 |
| AgentScope: A Flexible yet Robust Multi-Agent Platform | Feb 21, 2024 | Multi-agent Integration | Code Available | 11 | 5 |
| NYU CTF Bench: A Scalable Open-Source Benchmark Dataset for Evaluating LLMs in Offensive Security | Jun 8, 2024 | Task Planning, Vulnerability Detection | Code Available | 11 | 5 |
| WebWalker: Benchmarking LLMs in Web Traversal | Jan 13, 2025 | Benchmarking, Open-Domain Question Answering | Code Available | 11 | 5 |
| SAM 2: Segment Anything in Images and Videos | Aug 1, 2024 | Image Segmentation, Robot Manipulation Generalization | Code Available | 11 | 5 |