| Robust Inverse Graphics via Probabilistic Inference | Feb 2, 2024 | NeRF | Code Available | 7 |
| LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters! | Feb 11, 2025 | Large Language Model, Math | Code Available | 7 |
| From Bytes to Ideas: Language Modeling with Autoregressive U-Nets | Jun 17, 2025 | Language Modeling, Language Modelling | Code Available | 7 |
| Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation | Oct 10, 2024 | 4k, Image Animation | Code Available | 7 |
| PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation | Jan 20, 2025 | Language Modeling, Language Modelling | Code Available | 7 |
| 2D Gaussian Splatting for Geometrically Accurate Radiance Fields | Mar 26, 2024 | 3DGS, Novel View Synthesis | Code Available | 7 |
| MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models | Apr 20, 2023 | Image Description, Language Modelling | Code Available | 7 |
| MoE-LLaVA: Mixture of Experts for Large Vision-Language Models | Jan 29, 2024 | Hallucination, Mixture-of-Experts | Code Available | 7 |
| LHM: Large Animatable Human Reconstruction Model from a Single Image in Seconds | Mar 13, 2025 | 3D Human Reconstruction | Code Available | 7 |
| Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model | Feb 14, 2025 | Video Generation, Video Reconstruction | Code Available | 7 |
| Dynamic data sampler for cross-language transfer learning in large language models | May 17, 2024 | Language Modeling, Language Modelling | Code Available | 7 |
| CALE: Continuous Arcade Learning Environment | Oct 31, 2024 | Atari Games, Benchmarking | Code Available | 7 |
| LLaMA: Open and Efficient Foundation Language Models | Feb 27, 2023 | Arithmetic Reasoning, Code Generation | Code Available | 7 |
| FourierKAN outperforms MLP on Text Classification Head Fine-tuning | Aug 16, 2024 | Classification, Kolmogorov-Arnold Networks | Code Available | 7 |
| Prometheus: Inducing Fine-grained Evaluation Capability in Language Models | Oct 12, 2023 | Language Modelling, Large Language Model | Code Available | 7 |
| Domain Expansion of Image Generators | Jan 12, 2023 | | Code Available | 7 |
| OmniGen: Unified Image Generation | Sep 17, 2024 | Edge Detection, Image Generation | Code Available | 7 |
| Fast Timing-Conditioned Latent Audio Diffusion | Feb 7, 2024 | Audio Generation, GPU | Code Available | 7 |
| Segment Any Text: A Universal Approach for Robust, Efficient and Adaptable Sentence Segmentation | Jun 24, 2024 | parameter-efficient fine-tuning, Sentence | Code Available | 7 |
| PuLID: Pure and Lightning ID Customization via Contrastive Alignment | Apr 24, 2024 | Image Generation, Text to Image Generation | Code Available | 7 |
| Byte Latent Transformer: Patches Scale Better Than Tokens | Dec 13, 2024 | | Code Available | 7 |
| EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation | Nov 15, 2024 | Audio-Driven Body Animation, Human Animation | Code Available | 7 |
| OmniGen2: Exploration to Advanced Multimodal Generation | Jun 23, 2025 | Image Generation, multimodal generation | Code Available | 7 |
| xDiT: an Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism | Nov 4, 2024 | GPU | Code Available | 7 |
| Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance | Mar 21, 2024 | Animated GIF Generation, Image Animation | Code Available | 7 |
| GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning | Jul 1, 2025 | document understanding, Multimodal Reasoning | Code Available | 7 |
| Gravity-aligned Rotation Averaging with Circular Regression | Oct 16, 2024 | Mixed Reality, regression | Code Available | 7 |
| Full Scaling Automation for Sustainable Development of Green Data Centers | May 1, 2023 | Cloud Computing, CPU | Code Available | 7 |
| Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning | Mar 12, 2025 | Question Answering, RAG | Code Available | 7 |
| HiDream-I1: A High-Efficient Image Generative Foundation Model with Sparse Diffusion Transformer | May 28, 2025 | Image Generation, Mixture-of-Experts | Code Available | 7 |
| LLM Post-Training: A Deep Dive into Reasoning Large Language Models | Feb 28, 2025 | | Code Available | 7 |
| Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining | Aug 5, 2024 | Decoder, Depth Estimation | Code Available | 7 |
| LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset | Sep 21, 2023 | Chatbot, Diversity | Code Available | 7 |
| Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning | Apr 23, 2025 | Multimodal Reasoning, reinforcement-learning | Code Available | 7 |
| MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains | Jul 18, 2024 | | Code Available | 7 |
| xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference | Mar 17, 2025 | Mamba, Math | Code Available | 7 |
| SageAttention2++: A More Efficient Implementation of SageAttention2 | May 27, 2025 | Quantization, Video Generation | Code Available | 7 |
| NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking | Jun 21, 2024 | Autonomous Driving, Benchmarking | Code Available | 7 |
| Bridging Evolutionary Multiobjective Optimization and GPU Acceleration via Tensorization | Mar 26, 2025 | CPU, GPU | Code Available | 7 |
| PowerPM: Foundation Model for Power Systems | Aug 7, 2024 | Contrastive Learning, model | Code Available | 7 |
| SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration | Oct 3, 2024 | Image Generation, Quantization | Code Available | 7 |
| Open Deep Search: Democratizing Search with Open-source Reasoning Agents | Mar 26, 2025 | 10-shot image generation | Code Available | 7 |
| TextGrad: Automatic "Differentiation" via Text | Jun 11, 2024 | Question Answering, Specificity | Code Available | 7 |
| X-MeshGraphNet: Scalable Multi-Scale Graph Neural Networks for Physics Simulation | Nov 26, 2024 | | Code Available | 7 |
| ViDoRe Benchmark V2: Raising the Bar for Visual Retrieval | May 22, 2025 | Retrieval | Code Available | 7 |
| CodeUltraFeedback: An LLM-as-a-Judge Dataset for Aligning Large Language Models to Coding Preferences | Mar 14, 2024 | HumanEval | Code Available | 7 |
| VMamba: Visual State Space Model | Jan 18, 2024 | Computational Efficiency, Language Modeling | Code Available | 7 |
| In-Context LoRA for Diffusion Transformers | Oct 31, 2024 | Image Generation | Code Available | 7 |
| Rethinking the Sample Relations for Few-Shot Classification | Jan 23, 2025 | Classification, Contrastive Learning | Code Available | 7 |
| xLSTM: Extended Long Short-Term Memory | May 7, 2024 | Language Modeling, Language Modelling | Code Available | 7 |