| Diffusion Buffer: Online Diffusion-based Speech Enhancement with Sub-Second Latency | Jun 3, 2025 | GPU, Speech Enhancement | Unverified | 0 |
| Prosodic Structure Beyond Lexical Content: A Study of Self-Supervised Learning | Jun 3, 2025 | Emotion Recognition, Reading Comprehension | Unverified | 0 |
| TO-GATE: Clarifying Questions and Summarizing Responses with Trajectory Optimization for Eliciting Human Preference | Jun 3, 2025 | Question Generation, Question-Generation | Unverified | 0 |
| Hyperspectral Image Generation with Unmixing Guided Diffusion Model | Jun 3, 2025 | Hyperspectral Unmixing, Image Generation | Unverified | 0 |
| Modelling the Effects of Hearing Loss on Neural Coding in the Auditory Midbrain with Variational Conditioning | Jun 3, 2025 | Bayesian Optimisation | Unverified | 0 |
| Derivation of CRB and Refined SINR Expressions for OTFS-RSMA LEO ISAC Systems | Jun 3, 2025 | ISAC | Unverified | 0 |
| Minimally Invasive Brain Computer Interfaces: Evaluating the Impact of Tissue Layers on Signal Quality of Sub-Scalp EEG | Jun 3, 2025 | Brain Computer Interface, EEG | Unverified | 0 |
| A Pre-trained Framework for Multilingual Brain Decoding Using Non-invasive Recordings | Jun 3, 2025 | Brain Decoding, Fairness | Unverified | 0 |
| Enhancing Neural Autoregressive Distribution Estimators for Image Reconstruction | Jun 3, 2025 | Image Reconstruction | Unverified | 0 |
| Rethinking Whole-Body CT Image Interpretation: An Abnormality-Centric Approach | Jun 3, 2025 | | Unverified | 0 |
| UniSite: The First Cross-Structure Dataset and Learning Framework for End-to-End Ligand Binding Site Detection | Jun 3, 2025 | Drug Design, Prediction | Code Available | 1 |
| TriPSS: A Tri-Modal Keyframe Extraction Framework Using Perceptual, Structural, and Semantic Representations | Jun 3, 2025 | Retrieval, Video Summarization | Unverified | 0 |
| Tactile MNIST: Benchmarking Active Tactile Perception | Jun 3, 2025 | Benchmarking, Scene Understanding | Unverified | 0 |
| FailureSensorIQ: A Multi-Choice QA Dataset for Understanding Sensor Relationships and Failure Modes | Jun 3, 2025 | Benchmarking, Feature Engineering | Code Available | 0 |
| OpenCarbon: A Contrastive Learning-based Cross-Modality Neural Approach for High-Resolution Carbon Emission Prediction Using Open Data | Jun 3, 2025 | Contrastive Learning | Code Available | 0 |
| Elasticity of substitution and general model of economic growth | Jun 3, 2025 | Position | Unverified | 0 |
| How stealthy is stealthy? Studying the Efficacy of Black-Box Adversarial Attacks in the Real World | Jun 3, 2025 | Autonomous Vehicles | Unverified | 0 |
| Deep Learning Enhanced Multivariate GARCH | Jun 3, 2025 | Deep Learning | Unverified | 0 |
| Multi-Exit Kolmogorov-Arnold Networks: enhancing accuracy and parsimony | Jun 3, 2025 | Kolmogorov-Arnold Networks, scientific discovery | Unverified | 0 |
| HATA: Trainable and Hardware-Efficient Hash-Aware Top-k Attention for Scalable Large Model Inference | Jun 3, 2025 | | Code Available | 1 |
| METok: Multi-Stage Event-based Token Compression for Efficient Long Video Understanding | Jun 3, 2025 | Video Understanding | Code Available | 0 |
| Demystifying Reasoning Dynamics with Mutual Information: Thinking Tokens are Information Peaks in LLM Reasoning | Jun 3, 2025 | | Code Available | 2 |
| Attention-based transformer models for image captioning across languages: An in-depth survey and evaluation | Jun 3, 2025 | Caption Generation, Image Captioning | Unverified | 0 |
| Attacking Attention of Foundation Models Disrupts Downstream Tasks | Jun 3, 2025 | Depth Estimation, Image-text Retrieval | Code Available | 0 |
| Talk2SAM: Text-Guided Semantic Enhancement for Complex-Shaped Object Segmentation | Jun 3, 2025 | Segmentation, Semantic Segmentation | Code Available | 0 |
| ChemGraph: An Agentic Framework for Computational Chemistry Workflows | Jun 3, 2025 | Computational chemistry, Graph Neural Network | Unverified | 0 |
| Deep Learning Enhanced Multi-Day Turnover Quantitative Trading Algorithm for Chinese A-Share Market | Jun 3, 2025 | Stock Prediction | Unverified | 0 |
| Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology | Jun 3, 2025 | Multiple Instance Learning, Prognosis | Code Available | 2 |
| FAuNO: Semi-Asynchronous Federated Reinforcement Learning Framework for Task Offloading in Edge Systems | Jun 3, 2025 | Edge-computing | Unverified | 0 |
| Mitigating Non-IID Drift in Zeroth-Order Federated LLM Fine-Tuning with Transferable Sparsity | Jun 3, 2025 | Federated Learning | Unverified | 0 |
| TaxAgent: How Large Language Model Designs Fiscal Policy | Jun 3, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| FORLA: Federated Object-centric Representation Learning with Slot Attention | Jun 3, 2025 | Decoder, Federated Learning | Unverified | 0 |
| HGOT: Self-supervised Heterogeneous Graph Neural Network with Optimal Transport | Jun 3, 2025 | Graph Neural Network, Node Classification | Unverified | 0 |
| ViTNF: Leveraging Neural Fields to Boost Vision Transformers in Generalized Category Discovery | Jun 3, 2025 | Few-Shot Learning | Unverified | 0 |
| Rewarding the Unlikely: Lifting GRPO Beyond Distribution Sharpening | Jun 3, 2025 | Automated Theorem Proving | Unverified | 0 |
| EALG: Evolutionary Adversarial Generation of Language Model-Guided Generators for Combinatorial Optimization | Jun 3, 2025 | Combinatorial Optimization, Language Modeling | Unverified | 0 |
| VTGaussian-SLAM: RGBD SLAM for Large Scale Scenes with Splatting View-Tied 3D Gaussians | Jun 3, 2025 | GPU, Simultaneous Localization and Mapping | Unverified | 0 |
| The Future of Continual Learning in the Era of Foundation Models: Three Key Directions | Jun 3, 2025 | Continual Learning | Unverified | 0 |
| Hierarchical Self-Prompting SAM: A Prompt-Free Medical Image Segmentation Framework | Jun 3, 2025 | Image Segmentation, Lesion Segmentation | Unverified | 0 |
| On the Robustness of Tabular Foundation Models: Test-Time Attacks and In-Context Defenses | Jun 3, 2025 | In-Context Learning | Unverified | 0 |
| Robustness in Both Domains: CLIP Needs a Robust Text Encoder | Jun 3, 2025 | | Unverified | 0 |
| A Survey of Deep Learning Video Super-Resolution | Jun 3, 2025 | Deep Learning, Super-Resolution | Unverified | 0 |
| A Smart Multimodal Healthcare Copilot with Powerful LLM Reasoning | Jun 3, 2025 | Decision Making, Diagnostic | Code Available | 3 |
| Co-Evolving LLM Coder and Unit Tester via Reinforcement Learning | Jun 3, 2025 | Code Generation, reinforcement-learning | Code Available | 4 |
| Application of convolutional neural networks in image super-resolution | Jun 3, 2025 | Image Super-Resolution, Super-Resolution | Unverified | 0 |
| SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation | Jun 3, 2025 | Benchmarking, Style Transfer | Unverified | 0 |
| ORV: 4D Occupancy-centric Robot Video Generation | Jun 3, 2025 | Video Generation | Code Available | 2 |
| Response-Level Rewards Are All You Need for Online Reinforcement Learning in LLMs: A Mathematical Perspective | Jun 3, 2025 | All | Unverified | 0 |
| FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models | Jun 3, 2025 | Benchmarking, Domain Adaptation | Unverified | 0 |
| Sociodynamics-inspired Adaptive Coalition and Client Selection in Federated Learning | Jun 3, 2025 | Federated Learning, Privacy Preserving | Unverified | 0 |