| RAIL: Region-Aware Instructive Learning for Semi-Supervised Tooth Segmentation in CBCT | May 6, 2025 | Transfer Learning | Code Available | 1 |
| Geospatial Mechanistic Interpretability of Large Language Models | May 6, 2025 | Spatial Reasoning | Code Available | 1 |
| Learning-based Homothetic Tube MPC | May 6, 2025 | Model Predictive Control | Code Available | 1 |
| OSUniverse: Benchmark for Multimodal GUI-navigation AI Agents | May 6, 2025 | | Code Available | 1 |
| Mamba-Diffusion Model with Learnable Wavelet for Controllable Symbolic Music Generation | May 6, 2025 | Image Generation, Mamba | Code Available | 1 |
| Panoramic Out-of-Distribution Segmentation | May 6, 2025 | Disentanglement, Domain Generalization | Code Available | 1 |
| Blending 3D Geometry and Machine Learning for Multi-View Stereopsis | May 6, 2025 | 3D geometry | Code Available | 1 |
| Fixed-Length Dense Fingerprint Representation | May 6, 2025 | | Code Available | 1 |
| Bounding Box-Guided Diffusion for Synthesizing Industrial Images and Segmentation Map | May 6, 2025 | Dataset Generation, Segmentation | Code Available | 1 |
| WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch | May 6, 2025 | | Code Available | 1 |
| IndicSQuAD: A Comprehensive Multilingual Question Answering Dataset for Indic Languages | May 6, 2025 | Question Answering | Code Available | 1 |
| CombiBench: Benchmarking LLM Capability for Combinatorial Mathematics | May 6, 2025 | Benchmarking | Code Available | 1 |
| Framework GNN-AID: Graph Neural Network Analysis Interpretation and Defense | May 6, 2025 | Graph Neural Network | Code Available | 1 |
| Multi-View Learning with Context-Guided Receptance for Image Denoising | May 5, 2025 | Computational Efficiency, Denoising | Code Available | 1 |
| fastabx: A library for efficient computation of ABX discriminability | May 5, 2025 | Representation Learning | Code Available | 1 |
| Token Coordinated Prompt Attention is Needed for Visual Prompting | May 5, 2025 | Diversity, Visual Prompting | Code Available | 1 |
| RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale | May 5, 2025 | Decoder | Code Available | 1 |
| Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL | May 5, 2025 | Mathematical Reasoning | Code Available | 1 |
| Rewriting Pre-Training Data Boosts LLM Performance in Math and Code | May 5, 2025 | Code Generation, GSM8K | Code Available | 1 |
| SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning | May 5, 2025 | | Code Available | 1 |
| CreoPep: A Universal Deep Learning Framework for Target-Specific Peptide Design and Optimization | May 5, 2025 | Diversity, Language Modeling | Code Available | 1 |
| Knowing You Don't Know: Learning When to Continue Search in Multi-round RAG through Self-Practicing | May 5, 2025 | In-Context Reinforcement Learning, RAG | Code Available | 1 |
| Towards Quantifying the Hessian Structure of Neural Networks | May 5, 2025 | | Code Available | 1 |
| ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations | May 5, 2025 | Network Pruning | Code Available | 1 |
| AutoLibra: Agent Metric Induction from Open-Ended Feedback | May 5, 2025 | Prompt Engineering | Code Available | 1 |
| Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models | May 5, 2025 | Anomaly Segmentation, Segmentation | Code Available | 1 |
| Invoke Interfaces Only When Needed: Adaptive Invocation for Large Language Models in Question Answering | May 5, 2025 | Hallucination, Question Answering | Code Available | 1 |
| Uncertainty-Weighted Image-Event Multimodal Fusion for Video Anomaly Detection | May 5, 2025 | Anomaly Detection, Anomaly Detection In Surveillance Videos | Code Available | 1 |
| MUSAR: Exploring Multi-Subject Customization from Single-Subject Dataset via Attention Routing | May 5, 2025 | Attribute | Code Available | 1 |
| NTIRE 2025 Challenge on UGC Video Enhancement: Methods and Results | May 5, 2025 | Video Enhancement | Code Available | 1 |
| CASA: CNN Autoencoder-based Score Attention for Efficient Multivariate Long-term Time-series Forecasting | May 4, 2025 | Time Series, Time Series Forecasting | Code Available | 1 |
| Adaptive Thinking via Mode Policy Optimization for Social Language Agents | May 4, 2025 | | Code Available | 1 |
| RTV-Bench: Benchmarking MLLM Continuous Perception, Understanding and Reasoning through Real-Time Video | May 4, 2025 | Benchmarking, Question Answering | Code Available | 1 |
| Cricket: A Self-Powered Chirping Pixel | May 4, 2025 | | Code Available | 1 |
| Small Clips, Big Gains: Learning Long-Range Refocused Temporal Information for Video Super-Resolution | May 4, 2025 | Computational Efficiency, Image Super-Resolution | Code Available | 1 |
| ProDisc-VAD: An Efficient System for Weakly-Supervised Anomaly Detection in Video Surveillance Applications | May 4, 2025 | Anomaly Detection In Surveillance Videos, Contrastive Learning | Code Available | 1 |
| Attention Mechanisms Perspective: Exploring LLM Processing of Graph-Structured Data | May 4, 2025 | | Code Available | 1 |
| HybridGS: High-Efficiency Gaussian Splatting Data Compression using Dual-Channel Sparse Representation and Point Cloud Encoder | May 3, 2025 | 3DGS, Data Compression | Code Available | 1 |
| Accelerating Volumetric Medical Image Annotation via Short-Long Memory SAM 2 | May 3, 2025 | Computed Tomography (CT), Semantic Segmentation | Code Available | 1 |
| DualDiff: Dual-branch Diffusion Model for Autonomous Driving with Semantic Fusion | May 3, 2025 | 3D Object Detection, Autonomous Driving | Code Available | 1 |
| An LSTM-PINN Hybrid Method to the specific problem of population forecasting | May 3, 2025 | | Code Available | 1 |
| Morello: Compiling Fast Neural Networks with Dynamic Programming and Spatial Compression | May 3, 2025 | CPU | Code Available | 1 |
| FreePCA: Integrating Consistency Information across Long-short Frames in Training-free Long Video Generation via Principal Component Analysis | May 2, 2025 | Video Generation | Code Available | 1 |
| 2DXformer: Dual Transformers for Wind Power Forecasting with Dual Exogenous Variables | May 2, 2025 | | Code Available | 1 |
| VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding | May 2, 2025 | Anomaly Detection, Common Sense Reasoning | Code Available | 1 |
| CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment | May 2, 2025 | audio-visual learning, cross-modal alignment | Code Available | 1 |
| CDFormer: Cross-Domain Few-Shot Object Detection Transformer Against Feature Confusion | May 2, 2025 | Cross-Domain Few-Shot, Cross-Domain Few-Shot Object Detection | Code Available | 1 |
| Autonomous Embodied Agents: When Robotics Meets Deep Learning Reasoning | May 2, 2025 | Deep Learning | Code Available | 1 |
| Carbon Aware Transformers Through Joint Model-Hardware Optimization | May 2, 2025 | model | Code Available | 1 |
| SpectrumFM: A Foundation Model for Intelligent Spectrum Management | May 2, 2025 | Anomaly Detection, Few-Shot Learning | Code Available | 1 |