| Gumbel-max List Sampling for Distribution Coupling with Multiple Samples | Jun 5, 2025 | LEMMA | Unverified | 0 |
| Efficient Robust Conformal Prediction via Lipschitz-Bounded Networks | Jun 5, 2025 | Adversarial Attack, Computational Efficiency | Code Available | 0 |
| Noninvasive precision modulation of high-level neural population activity via natural vision perturbations | Jun 5, 2025 | | Code Available | 0 |
| Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models | Jun 5, 2025 | All, Math | Unverified | 0 |
| MTPNet: Multi-Grained Target Perception for Unified Activity Cliff Prediction | Jun 5, 2025 | Drug Discovery, Prediction | Code Available | 1 |
| An SCMA Receiver for 6G NTN based on Multi-Task Learning | Jun 5, 2025 | Edge-computing, Multi-Task Learning | Unverified | 0 |
| Joint Beamforming and Integer User Association using a GNN with Gumbel-Softmax Reparameterizations | Jun 5, 2025 | Graph Neural Network | Unverified | 0 |
| MonkeyOCR: Document Parsing with a Structure-Recognition-Relation Triplet Paradigm | Jun 5, 2025 | GPU, Relation | Code Available | 9 |
| Direct Numerical Layout Generation for 3D Indoor Scene Synthesis via Spatial Reasoning | Jun 5, 2025 | In-Context Learning, Indoor Scene Synthesis | Unverified | 0 |
| LLM-First Search: Self-Guided Exploration of the Solution Space | Jun 5, 2025 | | Code Available | 1 |
| Demonstrations of Integrity Attacks in Multi-Agent Systems | Jun 5, 2025 | Code Generation, Natural Language Understanding | Unverified | 0 |
| LogicPuzzleRL: Cultivating Robust Mathematical Reasoning in LLMs via Reinforcement Learning | Jun 5, 2025 | Mathematical Reasoning, reinforcement-learning | Code Available | 0 |
| PixCell: A generative foundation model for digital histopathology images | Jun 5, 2025 | Cell Segmentation, Data Augmentation | Unverified | 0 |
| Reasoning or Overthinking: Evaluating Large Language Models on Financial Sentiment Analysis | Jun 5, 2025 | Sentiment Analysis, Sentiment Classification | Unverified | 0 |
| Adaptive Preconditioners Trigger Loss Spikes in Adam | Jun 5, 2025 | Attribute | Unverified | 0 |
| Mathematical Reasoning for Unmanned Aerial Vehicles: A RAG-Based Approach for Complex Arithmetic Reasoning | Jun 5, 2025 | Arithmetic Reasoning, Math | Code Available | 0 |
| DM-SegNet: Dual-Mamba Architecture for 3D Medical Image Segmentation with Global Context Modeling | Jun 5, 2025 | Anatomy, Brain Tumor Segmentation | Unverified | 0 |
| SUCEA: Reasoning-Intensive Retrieval for Adversarial Fact-checking through Claim Decomposition and Editing | Jun 5, 2025 | Fact Checking, Misinformation | Code Available | 0 |
| Knowledgeable-r1: Policy Optimization for Knowledge Exploration in Retrieval-Augmented Generation | Jun 5, 2025 | counterfactual, RAG | Code Available | 0 |
| Counterfactual reasoning: an analysis of in-context emergence | Jun 5, 2025 | counterfactual, Counterfactual Reasoning | Code Available | 0 |
| DACN: Dual-Attention Convolutional Network for Hyperspectral Image Super-Resolution | Jun 5, 2025 | Hyperspectral Image Super-Resolution, Image Super-Resolution | Code Available | 0 |
| MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning | Jun 5, 2025 | Math, Mathematical Reasoning | Code Available | 2 |
| SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs | Jun 5, 2025 | | Code Available | 2 |
| On the Comprehensibility of Multi-structured Financial Documents using LLMs and Pre-processing Tools | Jun 5, 2025 | | Code Available | 0 |
| Can Foundation Models Generalise the Presentation Attack Detection Capabilities on ID Cards? | Jun 5, 2025 | Diversity | Unverified | 0 |
| TextVidBench: A Benchmark for Long Video Scene Text Understanding | Jun 5, 2025 | Prompt Engineering, Question Answering | Unverified | 0 |
| Neural Inverse Rendering from Propagating Light | Jun 5, 2025 | 3D Reconstruction, Inverse Rendering | Unverified | 0 |
| Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting | Jun 5, 2025 | 3DGS, Novel View Synthesis | Unverified | 0 |
| ContentV: Efficient Training of Video Generation Models with Limited Compute | Jun 5, 2025 | Image Generation, Video Generation | Unverified | 0 |
| Robustness as Architecture: Designing IQA Models to Withstand Adversarial Perturbations | Jun 5, 2025 | Image Quality Assessment | Unverified | 0 |
| APVR: Hour-Level Long Video Understanding with Adaptive Pivot Visual Information Retrieval | Jun 5, 2025 | Information Retrieval, Retrieval | Unverified | 0 |
| Bringing SAM to new heights: Leveraging elevation data for tree crown segmentation from drone imagery | Jun 5, 2025 | Instance Segmentation, Semantic Segmentation | Unverified | 0 |
| Multi-scale Image Super Resolution with a Single Auto-Regressive Model | Jun 5, 2025 | Image Super-Resolution, Super-Resolution | Unverified | 0 |
| Interpretable Multimodal Framework for Human-Centered Street Assessment: Integrating Visual-Language Models for Perceptual Urban Diagnostics | Jun 5, 2025 | Large Language Model | Unverified | 0 |
| PATS: Proficiency-Aware Temporal Sampling for Multi-View Sports Skill Assessment | Jun 5, 2025 | | Unverified | 0 |
| Beyond Cropped Regions: New Benchmark and Corresponding Baseline for Chinese Scene Text Retrieval in Diverse Layouts | Jun 5, 2025 | Retrieval, Text Retrieval | Unverified | 0 |
| Structure-Aware Radar-Camera Depth Estimation | Jun 5, 2025 | Depth Estimation, Monocular Depth Estimation | Unverified | 0 |
| Point Cloud Segmentation of Agricultural Vehicles using 3D Gaussian Splatting | Jun 5, 2025 | 3DGS, Point Cloud Segmentation | Unverified | 0 |
| UAV4D: Dynamic Neural Rendering of Human-Centric UAV Imagery using Gaussian Splatting | Jun 5, 2025 | Neural Rendering, Novel View Synthesis | Unverified | 0 |
| A Survey on Vietnamese Document Analysis and Recognition: Challenges and Future Directions | Jun 5, 2025 | Computational Efficiency, document understanding | Unverified | 0 |
| FG 2025 TrustFAA: the First Workshop on Towards Trustworthy Facial Affect Analysis: Advancing Insights of Fairness, Explainability, and Safety (TrustFAA) | Jun 5, 2025 | Action Unit Detection, Depression Detection | Unverified | 0 |
| DIMCIM: A Quantitative Evaluation Framework for Default-mode Diversity and Generalization in Text-to-Image Generative Models | Jun 5, 2025 | Benchmarking, Diversity | Unverified | 0 |
| CIVET: Systematic Evaluation of Understanding in VLMs | Jun 5, 2025 | Object | Unverified | 0 |
| FRED: The Florence RGB-Event Drone Dataset | Jun 5, 2025 | Benchmarking, Trajectory Forecasting | Unverified | 0 |
| Track Any Anomalous Object: A Granular Video Anomaly Detection Pipeline | Jun 5, 2025 | Anomaly Detection, Anomaly Localization | Unverified | 0 |
| Vision-Based Autonomous MM-Wave Reflector Using ArUco-Driven Angle-of-Arrival Estimation | Jun 5, 2025 | Raspberry Pi 4 | Unverified | 0 |
| EOC-Bench: Can MLLMs Identify, Recall, and Forecast Objects in an Egocentric World? | Jun 5, 2025 | Object | Unverified | 0 |
| Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs | Jun 5, 2025 | cross-modal alignment, Dense Captioning | Unverified | 0 |
| Unleashing Hour-Scale Video Training for Long Video-Language Understanding | Jun 5, 2025 | Instruction Following, Language Modeling | Unverified | 0 |
| Refer to Anything with Vision-Language Prompts | Jun 5, 2025 | Benchmarking, Generalized Referring Expression Segmentation | Unverified | 0 |