| Is Value Learning Really the Main Bottleneck in Offline RL? | Jun 13, 2024 | Imitation Learning, Offline RL | Code Available | 3 |
| DANA: Domain-Aware Neurosymbolic Agents for Consistency and Accuracy | Sep 27, 2024 | Financial Analysis | Code Available | 3 |
| Compact 3D Gaussian Splatting for Static and Dynamic Radiance Fields | Aug 7, 2024 | 3DGS, Model Compression | Code Available | 3 |
| MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM | Nov 25, 2024 | Autonomous Driving, Novel View Synthesis | Code Available | 3 |
| Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2 | Aug 9, 2024 | All | Code Available | 3 |
| DPLM-2: A Multimodal Diffusion Protein Language Model | Oct 17, 2024 | Language Modeling, Language Modelling | Code Available | 3 |
| Automated Formulaic Alpha Generation for Quantitative Investing using Evolutionary Algorithms | Mar 13, 2022 | Evolutionary Algorithms | Code Available | 3 |
| The False Promise of Imitating Proprietary LLMs | May 25, 2023 | Language Modelling | Code Available | 3 |
| Visual Geometry Grounded Deep Structure From Motion | Dec 7, 2023 | Point Tracking | Code Available | 3 |
| A Foundation Model for the Earth System | May 20, 2024 | Computational Efficiency, Deep Learning | Code Available | 3 |
| DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning | Jun 14, 2024 | Offline RL | Code Available | 3 |
| Human-level play in the game of Diplomacy by combining language models with strategic reasoning | Nov 22, 2022 | AI Agent, Language Modeling | Code Available | 3 |
| Improving Text Embeddings with Large Language Models | Dec 31, 2023 | Decoder, Diversity | Code Available | 3 |
| Performance Analysis of Open Source Machine Learning Frameworks for Various Parameters in Single-Threaded and Multi-Threaded Modes | Aug 29, 2017 | BIG-bench Machine Learning, CPU | Code Available | 3 |
| Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models | Oct 3, 2024 | | Code Available | 3 |
| RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control | May 27, 2024 | | Code Available | 3 |
| Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models | Dec 18, 2024 | Representation Learning, Robot Manipulation | Code Available | 3 |
| RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation | Mar 8, 2024 | Code Generation, Hallucination | Code Available | 3 |
| Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders | Jul 19, 2024 | | Code Available | 3 |
| DataDecide: How to Predict Best Pretraining Data with Small Experiments | Apr 15, 2025 | ARC, HellaSwag | Code Available | 3 |
| The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry | Feb 6, 2024 | | Code Available | 3 |
| Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection | Jun 8, 2020 | Dense Object Detection, General Classification | Code Available | 3 |
| Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs | Mar 8, 2025 | | Code Available | 3 |
| DRCT: Saving Image Super-resolution away from Information Bottleneck | Mar 31, 2024 | Image Super-Resolution, Super-Resolution | Code Available | 3 |
| TopoX: A Suite of Python Packages for Machine Learning on Topological Domains | Feb 4, 2024 | | Code Available | 3 |
| OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia | Jan 23, 2025 | Emotion Recognition, Event Detection | Code Available | 3 |
| Emu3: Next-Token Prediction is All You Need | Sep 27, 2024 | All | Code Available | 3 |
| Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving | Apr 3, 2025 | Reinforcement Learning (RL) | Code Available | 3 |
| InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation | Mar 3, 2026 | | Unverified | 2 |
| MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data | Mar 10, 2026 | | Unverified | 2 |
| HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images | Mar 3, 2026 | | Unverified | 2 |
| Stochastic Self-Guidance for Training-Free Enhancement of Diffusion Models | Mar 4, 2026 | | Unverified | 2 |
| Phi-4-reasoning-vision-15B Technical Report | Mar 4, 2026 | | Unverified | 2 |
| Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator | Mar 5, 2026 | | Unverified | 2 |
| NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation | Mar 5, 2026 | | Unverified | 2 |
| ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding | Mar 5, 2026 | | Unverified | 2 |
| From Word to World: Can Large Language Models be Implicit Text-based World Models? | Mar 5, 2026 | | Unverified | 2 |
| Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling | Mar 4, 2026 | | Unverified | 2 |
| Physical Simulator In-the-Loop Video Generation | Mar 6, 2026 | | Unverified | 2 |
| ArcFlow: Unleashing 2-Step Text-to-Image Generation via High-Precision Non-Linear Flow Distillation | Feb 9, 2026 | | Unverified | 2 |
| Mantis: A Versatile Vision-Language-Action Model with Disentangled Visual Foresight | Feb 23, 2026 | | Unverified | 2 |
| Endless Terminals: Scaling RL Environments for Terminal Agents | Feb 14, 2026 | | Unverified | 2 |
| Experiential Reinforcement Learning | Feb 15, 2026 | | Unverified | 2 |
| PyVision-RL: Forging Open Agentic Vision Models via RL | Feb 24, 2026 | | Unverified | 2 |
| TOPReward: Token Probabilities as Hidden Zero-Shot Rewards for Robotics | Feb 22, 2026 | | Unverified | 2 |
| AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents | Feb 16, 2026 | | Unverified | 2 |
| Should We Still Pretrain Encoders with Masked Language Modeling? | Feb 24, 2026 | | Unverified | 2 |
| Streaming Autoregressive Video Generation via Diagonal Distillation | Mar 11, 2026 | | Unverified | 2 |
| Accelerating Streaming Video Large Language Models via Hierarchical Token Compression | Feb 11, 2026 | | Unverified | 2 |
| LLM2Vec-Gen: Generative Embeddings from Large Language Models | Mar 11, 2026 | | Unverified | 2 |