| AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench | Jul 3, 2025 | Navigate | Code Available | 2 |
| Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation | Jul 3, 2025 | Diversity, Video Generation | Code Available | 2 |
| MathOptAI.jl: Embed trained machine learning predictors into JuMP models | Jul 3, 2025 | CPU, Gaussian Processes | Code Available | 2 |
| MambAttention: Mamba with Multi-Head Attention for Generalizable Single-Channel Speech Enhancement | Jul 1, 2025 | Automatic Speech Recognition, Mamba | Code Available | 2 |
| Advancing Learnable Multi-Agent Pathfinding Solvers with Active Fine-Tuning | Jun 30, 2025 | Imitation Learning, Trajectory Planning | Code Available | 2 |
| DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World | Jun 30, 2025 | Caption Generation, Object | Code Available | 2 |
| NavMorph: A Self-Evolving World Model for Vision-and-Language Navigation in Continuous Environments | Jun 30, 2025 | Decision Making, Vision and Language Navigation | Code Available | 2 |
| SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning | Jun 30, 2025 | Math, Multi-agent Reinforcement Learning | Code Available | 2 |
| VolumetricSMPL: A Neural Volumetric Body Model for Efficient Interactions, Contacts, and Collisions | Jun 29, 2025 | Computational Efficiency, GPU | Code Available | 2 |
| MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation | Jun 29, 2025 | GPU, Optical Flow Estimation | Code Available | 2 |
| EAMamba: Efficient All-Around Vision State Space Model for Image Restoration | Jun 27, 2025 | All, Deblurring | Code Available | 2 |
| LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs | Jun 27, 2025 | Question Answering, Video Question Answering | Code Available | 2 |
| Seg-R1: Segmentation Can Be Surprisingly Simple with Reinforcement Learning | Jun 27, 2025 | Foreground Segmentation, object-detection | Code Available | 2 |
| R1-Track: Direct Application of MLLMs to Visual Object Tracking via Reinforcement Learning | Jun 27, 2025 | Object Tracking, Template Matching | Code Available | 2 |
| The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements | Jun 27, 2025 | | Code Available | 2 |
| Spatial Mental Modeling from Limited Views | Jun 26, 2025 | | Code Available | 2 |
| BMFM-DNA: A SNP-aware DNA foundation model to capture variant effects | Jun 26, 2025 | Imputation, Promoter Detection | Code Available | 2 |
| DBConformer: Dual-Branch Convolutional Transformer for EEG Decoding | Jun 26, 2025 | EEG, Eeg Decoding | Code Available | 2 |
| EraRAG: Efficient and Incremental Retrieval Augmented Generation for Growing Corpora | Jun 26, 2025 | Graph Reconstruction, RAG | Code Available | 2 |
| Curve-Aware Gaussian Splatting for 3D Parametric Curve Reconstruction | Jun 26, 2025 | Point cloud reconstruction | Code Available | 2 |
| ESMStereo: Enhanced ShuffleMixer Disparity Upsampling for Real-Time and Accurate Stereo Matching | Jun 26, 2025 | Disparity Estimation, Stereo Matching | Code Available | 2 |
| KaLM-Embedding-V2: Superior Training Techniques and Data Inspire A Versatile Embedding Model | Jun 26, 2025 | Representation Learning, Retrieval | Code Available | 2 |
| HumanOmniV2: From Understanding to Omni-Modal Reasoning with Context | Jun 26, 2025 | Large Language Model, Multimodal Reasoning | Code Available | 2 |
| Parallels Between VLA Model Post-Training and Human Motor Learning: Progress, Challenges, and Trends | Jun 26, 2025 | Action Generation, Vision-Language-Action | Code Available | 2 |
| FairyGen: Storied Cartoon Video from a Single Child-Drawn Character | Jun 26, 2025 | | Code Available | 2 |