| ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking | Jan 22, 2026 | | Unverified | 3 |
| Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders | Jan 22, 2026 | | Unverified | 3 |
| ActionMesh: Animated 3D Mesh Generation with Temporal 3D Diffusion | Jan 22, 2026 | | Unverified | 3 |
| FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale | Jul 16, 2025 | Computational Efficiency, GPU | Code Available | 3 |
| PhysX: Physical-Grounded 3D Asset Generation | Jul 16, 2025 | 3D Generation, Image to 3D | Code Available | 3 |
| Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI | Jul 16, 2025 | GPU | Code Available | 3 |
| A Survey on Latent Reasoning | Jul 8, 2025 | Survey | Code Available | 3 |
| Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving | Jul 8, 2025 | Code Repair, Transfer Learning | Code Available | 3 |
| DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge | Jul 6, 2025 | Image Generation, Multimodal Reasoning | Code Available | 3 |
| RLVER: Reinforcement Learning with Verifiable Emotion Rewards for Empathetic Agents | Jul 3, 2025 | Emotional Intelligence, reinforcement-learning | Code Available | 3 |
| No time to train! Training-Free Reference-Based Instance Segmentation | Jul 3, 2025 | Cross-Domain Few-Shot Object Detection, Few-Shot Object Detection | Code Available | 3 |
| Flash-VStream: Efficient Real-Time Understanding for Long Video Streams | Jun 30, 2025 | cross-modal alignment, EgoSchema | Code Available | 3 |
| L0: Reinforcement Learning to Become General Agents | Jun 30, 2025 | Question Answering, reinforcement-learning | Code Available | 3 |
| Epona: Autoregressive Diffusion World Model for Autonomous Driving | Jun 30, 2025 | Autonomous Driving, model | Code Available | 3 |
| Ovis-U1 Technical Report | Jun 29, 2025 | Image Generation, Text to Image Generation | Code Available | 3 |
| FineWeb2: One Pipeline to Scale Them All -- Adapting Pre-Training Data Processing to Every Language | Jun 26, 2025 | All | Code Available | 3 |
| The Ideation-Execution Gap: Execution Outcomes of LLM-Generated versus Human Research Ideas | Jun 25, 2025 | | Code Available | 3 |
| MMSearch-R1: Incentivizing LMMs to Search | Jun 25, 2025 | RAG, Retrieval-augmented Generation | Code Available | 3 |
| Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models | Jun 23, 2025 | Domain Adaptation, GPU | Code Available | 3 |
| ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation | Jun 22, 2025 | GPU, Image Generation | Code Available | 3 |
| Machine Mental Imagery: Empower Multimodal Reasoning with Latent Visual Tokens | Jun 20, 2025 | Image Generation, Multimodal Reasoning | Code Available | 3 |
| Camera Calibration via Circular Patterns: A Comprehensive Framework with Measurement Uncertainty and Unbiased Projection Model | Jun 20, 2025 | Camera Calibration | Code Available | 3 |
| TabArena: A Living Benchmark for Machine Learning on Tabular Data | Jun 20, 2025 | Benchmarking | Code Available | 3 |
| Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details | Jun 19, 2025 | Texture Synthesis | Code Available | 3 |
| AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning | Jun 16, 2025 | Action Generation, Autonomous Driving | Code Available | 3 |
| Vine Copulas as Differentiable Computational Graphs | Jun 16, 2025 | GPU, Scheduling | Code Available | 3 |
| Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences | Jun 16, 2025 | Document Summarization, GPU | Code Available | 3 |
| Discrete Diffusion in Large Language and Multimodal Models: A Survey | Jun 16, 2025 | Denoising | Code Available | 3 |
| ANIRA: An Architecture for Neural Network Inference in Real-Time Audio Applications | Jun 14, 2025 | Benchmarking | Code Available | 3 |
| FlexRAG: A Flexible and Comprehensive Framework for Retrieval-Augmented Generation | Jun 14, 2025 | Language Modeling, Language Modelling | Code Available | 3 |
| A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications | Jun 14, 2025 | Information Retrieval, Survey | Code Available | 3 |
| The Diffusion Duality | Jun 12, 2025 | Text Generation | Code Available | 3 |
| Spurious Rewards: Rethinking Training Signals in RLVR | Jun 12, 2025 | Math, Mathematical Reasoning | Code Available | 3 |
| AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation | Jun 12, 2025 | Video Generation | Code Available | 3 |
| TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Similarity Tree | Jun 12, 2025 | Continual Learning | Code Available | 3 |
| JAFAR: Jack up Any Feature at Any Resolution | Jun 10, 2025 | Feature Upsampling | Code Available | 3 |
| Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models | Jun 10, 2025 | 3D Lane Detection, 3D Object Detection | Code Available | 3 |
| MagCache: Fast Video Generation with Magnitude-Aware Cache | Jun 10, 2025 | SSIM, Video Generation | Code Available | 3 |
| Highly Compressed Tokenizer Can Generate Without Training | Jun 9, 2025 | Image Generation, Quantization | Code Available | 3 |
| G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems | Jun 9, 2025 | Large Language Model | Code Available | 3 |
| Hierarchical Lexical Graph for Enhanced Multi-Hop Retrieval | Jun 9, 2025 | Dataset Generation, RAG | Code Available | 3 |
| Real-Time Execution of Action Chunking Flow Policies | Jun 9, 2025 | Chunking, Vision-Language-Action | Code Available | 3 |
| Generalized Trajectory Scoring for End-to-end Multimodal Planning | Jun 7, 2025 | Autonomous Driving, Domain Generalization | Code Available | 3 |
| When to use Graphs in RAG: A Comprehensive Analysis for Graph Retrieval-Augmented Generation | Jun 6, 2025 | RAG, Retrieval | Code Available | 3 |
| FlashDMoE: Fast Distributed MoE in a Single Kernel | Jun 5, 2025 | 16k, CPU | Code Available | 3 |
| SupeRANSAC: One RANSAC to Rule Them All | Jun 5, 2025 | All, Pose Estimation | Code Available | 3 |
| INP-Former++: Advancing Universal Anomaly Detection via Intrinsic Normal Prototypes and Residual Learning | Jun 4, 2025 | Anomaly Detection, Medical Diagnosis | Code Available | 3 |
| HtFLlib: A Comprehensive Heterogeneous Federated Learning Library and Benchmark | Jun 4, 2025 | Federated Learning, Transfer Learning | Code Available | 3 |
| A Smart Multimodal Healthcare Copilot with Powerful LLM Reasoning | Jun 3, 2025 | Decision Making, Diagnostic | Code Available | 3 |
| Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation | Jun 2, 2025 | 4k, Descriptive | Code Available | 3 |