| Contextualized Automatic Speech Recognition with Dynamic Vocabulary Prediction and Activation | May 29, 2025 | Automatic Speech Recognition, Automatic Speech Recognition (ASR) | Unverified | 0 |
| Efficient Quantum Approximate kNN Algorithm via Granular-Ball Computing | May 29, 2025 | Quantization | Unverified | 0 |
| VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning? | May 29, 2025 | Video Understanding | Code Available | 1 |
| LeMoRe: Learn More Details for Lightweight Semantic Segmentation | May 29, 2025 | Computational Efficiency, Representation Learning | Code Available | 0 |
| Second Opinion Matters: Towards Adaptive Clinical AI via the Consensus of Expert Model Ensemble | May 29, 2025 | Decision Making, MedQA | Unverified | 0 |
| Can Large Language Models Challenge CNNs in Medical Image Analysis? | May 29, 2025 | Diagnostic, Medical Image Analysis | Unverified | 0 |
| LLM-Synth4KWS: Scalable Automatic Generation and Synthesis of Confusable Data for Custom Keyword Spotting | May 29, 2025 | Keyword Spotting, text-to-speech | Unverified | 0 |
| Bi-Residual Neural Network based Synchronous Motor Electrical Faults Diagnosis: Intra-link Layer Design for High-frequency Features | May 29, 2025 | Fault Diagnosis | Unverified | 0 |
| Semantics-Aware Human Motion Generation from Audio Instructions | May 29, 2025 | Motion Generation | Unverified | 0 |
| Interturn Fault Detection in IPMSMs: Two Adaptive Observer-based Solutions | May 29, 2025 | Fault Detection, parameter estimation | Unverified | 0 |
| SpatialSplat: Efficient Semantic 3D from Sparse Unposed Images | May 29, 2025 | 3D Reconstruction | Unverified | 0 |
| SAMamba: Adaptive State Space Modeling with Hierarchical Vision for Infrared Small Target Detection | May 29, 2025 | Domain Adaptation, feature selection | Code Available | 1 |
| SWE-bench Goes Live! | May 29, 2025 | | Code Available | 2 |
| Nosey: Open-source hardware for acoustic nasalance | May 29, 2025 | | Code Available | 0 |
| Multilook Coherent Imaging: Theoretical Guarantees and Algorithms | May 29, 2025 | | Code Available | 0 |
| PreFM: Online Audio-Visual Event Parsing via Predictive Future Modeling | May 29, 2025 | Video Understanding | Code Available | 1 |
| Diffusion-Based Generative Models for 3D Occupancy Prediction in Autonomous Driving | May 29, 2025 | Autonomous Driving | Unverified | 0 |
| MermaidFlow: Redefining Agentic Workflow Generation via Safety-Constrained Evolutionary Programming | May 29, 2025 | Diversity, Efficient Exploration | Code Available | 2 |
| Deep Learning-Based CSI Feedback for Wi-Fi Systems With Temporal Correlation | May 29, 2025 | Decoder | Unverified | 0 |
| Diffusion Sampling Path Tells More: An Efficient Plug-and-Play Strategy for Sample Filtering | May 29, 2025 | Denoising, Image Generation | Code Available | 0 |
| Cognitive Guardrails for Open-World Decision Making in Autonomous Drone Swarms | May 29, 2025 | Decision Making, Decision Making Under Uncertainty | Unverified | 0 |
| Going from a Representative Agent to Counterfactuals in Combinatorial Choice | May 29, 2025 | counterfactual, Counterfactual Inference | Unverified | 0 |
| DRO: A Python Library for Distributionally Robust Optimization in Machine Learning | May 29, 2025 | | Code Available | 2 |
| Synthetic Document Question Answering in Hungarian | May 29, 2025 | Optical Character Recognition (OCR), Question Answering | Code Available | 0 |
| Two Is Better Than One: Rotations Scale LoRAs | May 29, 2025 | Mixture-of-Experts | Unverified | 0 |
| Language-guided Learning for Object Detection Tackling Multiple Variations in Aerial Images | May 29, 2025 | Novel Object Detection, Object | Unverified | 0 |
| Spoken Language Modeling with Duration-Penalized Self-Supervised Units | May 29, 2025 | Language Modeling, Language Modelling | Code Available | 0 |
| Holistic Large-Scale Scene Reconstruction via Mixed Gaussian Splatting | May 29, 2025 | 3D Scene Reconstruction, GPU | Code Available | 1 |
| A Gibbs Sampler for Efficient Bayesian Inference in Sign-Identified SVARs | May 29, 2025 | Bayesian Inference | Unverified | 0 |
| PixelThink: Towards Efficient Chain-of-Pixel Reasoning | May 29, 2025 | Reasoning Segmentation, reinforcement-learning | Unverified | 0 |
| Few-Shot Speech Deepfake Detection Adaptation with Gaussian Processes | May 29, 2025 | Audio Deepfake Detection, DeepFake Detection | Code Available | 0 |
| (U)NFV: Supervised and Unsupervised Neural Finite Volume Methods for Solving Hyperbolic PDEs | May 29, 2025 | | Code Available | 0 |
| From Knowledge to Noise: CTIM-Rover and the Pitfalls of Episodic Memory in Software Engineering Agents | May 29, 2025 | AI Agent, Mixture-of-Experts | Code Available | 0 |
| Machine Learning Framework for Characterizing Processing-Structure Relationship in Block Copolymer Thin Films | May 29, 2025 | | Code Available | 0 |
| Fast Derivative Valuation from Volatility Surfaces using Machine Learning | May 29, 2025 | GPR | Code Available | 0 |
| Improved Learning via k-DTW: A Novel Dissimilarity Measure for Curves | May 29, 2025 | Dynamic Time Warping | Unverified | 0 |
| Bayesian Perspective on Memorization and Reconstruction | May 29, 2025 | Memorization | Unverified | 0 |
| Dynamic Estimation Loss Control in Variational Quantum Sensing via Online Conformal Inference | May 29, 2025 | Gravitational Wave Detection | Unverified | 0 |
| LUMION: Fast Fault Recovery for ML Jobs Using Programmable Optical Fabrics | May 29, 2025 | GPU | Unverified | 0 |
| Optimizing Connectivity and Scheduling of Near/Far Field Users in Massive MIMO NOMA System | May 29, 2025 | Clustering, Fairness | Unverified | 0 |
| CF-DETR: Coarse-to-Fine Transformer for Real-Time Object Detection | May 29, 2025 | GPU, object-detection | Unverified | 0 |
| Towards Explainable Sequential Learning | May 29, 2025 | Time Series | Unverified | 0 |
| Robust and Annotation-Free Wound Segmentation on Noisy Real-World Pressure Ulcer Images: Towards Automated DESIGN-R Assessment | May 29, 2025 | Segmentation | Unverified | 0 |
| A Computational Approach to Improving Fairness in K-means Clustering | May 29, 2025 | Clustering, Fairness | Unverified | 0 |
| DeepFilterGAN: A Full-band Real-time Speech Enhancement System with GAN-based Stochastic Regeneration | May 29, 2025 | Speech Enhancement | Unverified | 0 |
| Spoken question answering for visual queries | May 29, 2025 | Question Answering, Visual Question Answering (VQA) | Unverified | 0 |
| HMAD: Advancing E2E Driving with Anchored Offset Proposals and Simulation-Supervised Multi-target Scoring | May 29, 2025 | Autonomous Driving | Unverified | 0 |
| PhotoArtAgent: Intelligent Photo Retouching with Language Model-Based Artist Agents | May 29, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| LODGE: Level-of-Detail Large-Scale Gaussian Splatting with Efficient Rendering | May 29, 2025 | 3DGS, GPU | Unverified | 0 |
| RoboTransfer: Geometry-Consistent Video Diffusion for Robotic Visual Policy Transfer | May 29, 2025 | Imitation Learning, Video Generation | Unverified | 0 |