| Vision Transformers Don't Need Trained Registers | Jun 9, 2025 | | Code Available | 2 |
| Thinking vs. Doing: Agents that Reason by Scaling Test-Time Interaction | Jun 9, 2025 | Reinforcement Learning (RL) | Code Available | 2 |
| CausalPFN: Amortized Causal Effect Estimation via In-Context Learning | Jun 9, 2025 | Decision Making, Heterogeneous Treatment Effect Estimation | Code Available | 2 |
| HeuriGym: An Agentic Benchmark for LLM-Crafted Heuristics in Combinatorial Optimization | Jun 9, 2025 | Combinatorial Optimization, Memorization | Code Available | 2 |
| FunDiff: Diffusion Models over Function Spaces for Physics-Informed Generative Modeling | Jun 9, 2025 | Density Estimation | Code Available | 2 |
| OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation | Jun 9, 2025 | Image Generation | Code Available | 2 |
| Audio synthesizer inversion in symmetric parameter spaces with approximately equivariant flow matching | Jun 8, 2025 | | Code Available | 2 |
| Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs | Jun 8, 2025 | | Code Available | 2 |
| RecGPT: A Foundation Model for Sequential Recommendation | Jun 6, 2025 | Decoder, model | Code Available | 2 |
| Generating Long Semantic IDs in Parallel for Recommendation | Jun 6, 2025 | | Code Available | 2 |
| Contrastive Flow Matching | Jun 5, 2025 | | Code Available | 2 |
| Kinetics: Rethinking Test-Time Scaling Laws | Jun 5, 2025 | | Code Available | 2 |
| Search Arena: Analyzing Search-Augmented LLMs | Jun 5, 2025 | Fact Checking | Code Available | 2 |
| MegaHan97K: A Large-Scale Dataset for Mega-Category Chinese Character Recognition with over 97K Categories | Jun 5, 2025 | Benchmarking, Optical Character Recognition | Code Available | 2 |
| Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets | Jun 5, 2025 | | Code Available | 2 |
| Perceive Anything: Recognize, Explain, Caption, and Segment Anything in Images and Videos | Jun 5, 2025 | GPU, Semantic Segmentation | Code Available | 2 |
| Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting | Jun 5, 2025 | Autonomous Driving, NeRF | Code Available | 2 |
| EMBER2024 -- A Benchmark Dataset for Holistic Evaluation of Malware Classifiers | Jun 5, 2025 | Malware Analysis, Malware Classification | Code Available | 2 |
| Exploring Diffusion Transformer Designs via Grafting | Jun 5, 2025 | | Code Available | 2 |
| VideoMolmo: Spatio-Temporal Grounding Meets Pointing | Jun 5, 2025 | Autonomous Driving, Autonomous Navigation | Code Available | 2 |
| A Smooth Sea Never Made a Skilled SAILOR: Robust Imitation via Learning to Search | Jun 5, 2025 | Imitation Learning | Code Available | 2 |
| AliTok: Towards Sequence Modeling Alignment between Tokenizer and Autoregressive Model | Jun 5, 2025 | Decoder, Image Generation | Code Available | 2 |
| SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs | Jun 5, 2025 | | Code Available | 2 |
| MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning | Jun 5, 2025 | Math, Mathematical Reasoning | Code Available | 2 |
| Facial Appearance Capture at Home with Patch-Level Reflectance Prior | Jun 4, 2025 | | Code Available | 2 |
| chemtrain-deploy: A parallel and scalable framework for machine learning potentials in million-atom MD simulations | Jun 4, 2025 | Graph Neural Network | Code Available | 2 |
| Multi-view Surface Reconstruction Using Normal and Reflectance Cues | Jun 4, 2025 | Surface Reconstruction | Code Available | 2 |
| Photoreal Scene Reconstruction from an Egocentric Device | Jun 4, 2025 | | Code Available | 2 |
| LeanExplore: A search engine for Lean 4 declarations | Jun 4, 2025 | Automated Theorem Proving | Code Available | 2 |
| Savage-Dickey density ratio estimation with normalizing flows for Bayesian model comparison | Jun 4, 2025 | Density Ratio Estimation | Code Available | 2 |
| ORV: 4D Occupancy-centric Robot Video Generation | Jun 3, 2025 | Video Generation | Code Available | 2 |
| CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale | Jun 3, 2025 | Large Language Model | Code Available | 2 |
| Simulate Any Radar: Attribute-Controllable Radar Simulation via Waveform Parameter Embedding | Jun 3, 2025 | 3D Object Detection, Attribute | Code Available | 2 |
| Towards In-the-wild 3D Plane Reconstruction from a Single Image | Jun 3, 2025 | 3D Plane Detection | Code Available | 2 |
| KVCache Cache in the Wild: Characterizing and Optimizing KVCache Cache at a Large Cloud Provider | Jun 3, 2025 | | Code Available | 2 |
| Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology | Jun 3, 2025 | Multiple Instance Learning, Prognosis | Code Available | 2 |
| Demystifying Reasoning Dynamics with Mutual Information: Thinking Tokens are Information Peaks in LLM Reasoning | Jun 3, 2025 | | Code Available | 2 |
| HyperSteer: Activation Steering at Scale with Hypernetworks | Jun 3, 2025 | Dictionary Learning, Text Generation | Code Available | 2 |
| Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning | Jun 2, 2025 | Fact Verification, Language Modeling | Code Available | 2 |
| Compiler Optimization via LLM Reasoning for Efficient Model Serving | Jun 2, 2025 | Compiler Optimization, Large Language Model | Code Available | 2 |
| Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency | Jun 2, 2025 | reinforcement-learning, Reinforcement Learning | Code Available | 2 |
| GSCodec Studio: A Modular Framework for Gaussian Splat Compression | Jun 2, 2025 | Benchmarking | Code Available | 2 |
| The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning | Jun 2, 2025 | Math, Mathematical Reasoning | Code Available | 2 |
| Synthesis of discrete-continuous quantum circuits with multimodal diffusion models | Jun 2, 2025 | Denoising, Parameter Prediction | Code Available | 2 |
| DualMap: Online Open-Vocabulary Semantic Mapping for Natural Language Navigation in Dynamic Changing Scenes | Jun 2, 2025 | Natural Language Queries, Navigate | Code Available | 2 |
| AceVFI: A Comprehensive Survey of Advances in Video Frame Interpolation | Jun 1, 2025 | Mamba, Motion Compensation | Code Available | 2 |
| FusionAudio-1.2M: Towards Fine-grained Audio Captioning with Multimodal Contextual Fusion | Jun 1, 2025 | Audio captioning, Caption Generation | Code Available | 2 |
| MagiCodec: Simple Masked Gaussian-Injected Codec for High-Fidelity Reconstruction and Generation | May 31, 2025 | Language Modeling, Language Modelling | Code Available | 2 |
| CineMA: A Foundation Model for Cine Cardiac MRI | May 31, 2025 | Myocardium Segmentation | Code Available | 2 |
| AnnaAgent: Dynamic Evolution Agent System with Multi-Session Memory for Realistic Seeker Simulation | May 31, 2025 | | Code Available | 2 |