| D2AF: A Dual-Driven Annotation and Filtering Framework for Visual Grounding | May 30, 2025 | Diversity, Pseudo Label | Unverified | 0 |
| Spatiotemporal Analysis of Forest Machine Operations Using 3D Video Classification | May 30, 2025 | Activity Recognition, Video Classification | Unverified | 0 |
| PCIE_Pose Solution for EgoExo4D Pose and Proficiency Estimation Challenge | May 30, 2025 | Pose Estimation | Unverified | 0 |
| SA-Person: Text-Based Person Retrieval with Scene-aware Re-ranking | May 30, 2025 | Cross-Modal Retrieval, Person Retrieval | Unverified | 0 |
| Reason-SVG: Hybrid Reward RL for Aha-Moments in Vector Graphics Generation | May 30, 2025 | Reinforcement Learning (RL), Vector Graphics | Unverified | 0 |
| SARD: A Large-Scale Synthetic Arabic OCR Dataset for Book-Style Text Recognition | May 30, 2025 | Optical Character Recognition, Optical Character Recognition (OCR) | Unverified | 0 |
| A Cross Branch Fusion-Based Contrastive Learning Framework for Point Cloud Self-supervised Learning | May 30, 2025 | Contrastive Learning, Self-Supervised Learning | Unverified | 0 |
| DreamDance: Animating Character Art via Inpainting Stable Gaussian Worlds | May 30, 2025 | Image Inpainting, Video Generation | Unverified | 0 |
| Lightweight Relational Embedding in Task-Interpolated Few-Shot Networks for Enhanced Gastrointestinal Disease Classification | May 30, 2025 | Diagnostic, Few-Shot Learning | Unverified | 0 |
| TalkingHeadBench: A Multi-Modal Benchmark & Analysis of Talking-Head DeepFake Detection | May 30, 2025 | DeepFake Detection, Face Swapping | Unverified | 0 |
| MiniMax-Remover: Taming Bad Noise Helps Video Object Removal | May 30, 2025 | Video Editing, Video Generation | Unverified | 0 |
| AdaHuman: Animatable Detailed 3D Human Generation with Compositional Multiview Diffusion | May 30, 2025 | 3DGS, Image to 3D | Unverified | 0 |
| A Novel Coronary Artery Registration Method Based on Super-pixel Particle Swarm Optimization | May 30, 2025 | Anatomy, Image Registration | Unverified | 0 |
| Digital twins enable full-reference quality assessment of photoacoustic image reconstructions | May 30, 2025 | Full reference image quality assessment, Full-Reference Image Quality Assessment | Unverified | 0 |
| TumorGen: Boundary-Aware Tumor-Mask Synthesis with Rectified Flow Matching | May 30, 2025 | Computational Efficiency, Denoising | Unverified | 0 |
| Contrast-Invariant Self-supervised Segmentation for Quantitative Placental MRI | May 30, 2025 | Domain Adaptation, Segmentation | Unverified | 0 |
| Speech Token Prediction via Compressed-to-fine Language Modeling for Speech Generation | May 30, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| Edge Computing for Physics-Driven AI in Computational MRI: A Feasibility Study | May 30, 2025 | Computational Efficiency, Edge-computing | Unverified | 0 |
| Adversarial Threat Vectors and Risk Mitigation for Retrieval-Augmented Generation Systems | May 30, 2025 | Adversarial Attack, Data Poisoning | Unverified | 0 |
| Heterogeneous Graph Backdoor Attack | May 30, 2025 | Backdoor Attack, backdoor defense | Unverified | 0 |
| ClinBench-HPB: A Clinical Benchmark for Evaluating LLMs in Hepato-Pancreato-Biliary Diseases | May 30, 2025 | Medical Question Answering, Multiple-choice | Unverified | 0 |
| MythTriage: Scalable Detection of Opioid Use Disorder Myths on a Video-Sharing Platform | May 30, 2025 | Language Modeling, Language Modelling | Unverified | 0 |
| An AI-powered Knowledge Hub for Potato Functional Genomics | May 30, 2025 | AI Agent, Hallucination | Unverified | 0 |
| Randomized Dimensionality Reduction for Euclidean Maximization and Diversity Measures | May 30, 2025 | Dimensionality Reduction, Diversity | Unverified | 0 |
| Beyond FACS: Data-driven Facial Expression Dictionaries, with Application to Predicting Autism | May 30, 2025 | | Code Available | 1 |
| Conformal Prediction for Zero-Shot Models | May 30, 2025 | Conformal Prediction, Prediction | Code Available | 1 |
| Weakly-Supervised Affordance Grounding Guided by Part-Level Semantic Priors | May 30, 2025 | Human-Object Interaction Detection, Semantic Segmentation | Code Available | 1 |
| Shuffle PatchMix Augmentation with Confidence-Margin Weighted Pseudo-Labels for Enhanced Source-Free Domain Adaptation | May 30, 2025 | Data Augmentation, Domain Adaptation | Code Available | 0 |
| Revisiting Cross-Modal Knowledge Distillation: A Disentanglement Approach for RGBD Semantic Segmentation | May 30, 2025 | Autonomous Driving, Contrastive Learning | Code Available | 0 |
| EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding | May 30, 2025 | Action Recognition, Graph Generation | Code Available | 1 |
| S3CE-Net: Spike-guided Spatiotemporal Semantic Coupling and Expansion Network for Long Sequence Event Re-Identification | May 30, 2025 | Person Re-Identification | Code Available | 0 |
| NUC-Net: Non-uniform Cylindrical Partition Network for Efficient LiDAR Semantic Segmentation | May 30, 2025 | Autonomous Driving, GPU | Code Available | 0 |
| Efficient RAW Image Deblurring with Adaptive Frequency Modulation | May 30, 2025 | Computational Efficiency, Deblurring | Code Available | 1 |
| On Designing Diffusion Autoencoders for Efficient Generation and Representation Learning | May 30, 2025 | Denoising, Representation Learning | Code Available | 0 |
| MultiHoax: A Dataset of Multi-hop False-Premise Questions | May 30, 2025 | | Code Available | 0 |
| Training-free zero-shot 3D symmetry detection with visual features back-projected to geometry | May 30, 2025 | Symmetry Detection | Unverified | 0 |
| IRBridge: Solving Image Restoration Bridge with Pre-trained Generative Diffusion Models | May 30, 2025 | Image Restoration | Code Available | 1 |
| SORCE: Small Object Retrieval in Complex Environments | May 30, 2025 | Benchmarking, Image Retrieval | Code Available | 0 |
| Unleashing the Power of Intermediate Domains for Mixed Domain Semi-Supervised Medical Image Segmentation | May 30, 2025 | Domain Adaptation, Image Segmentation | Code Available | 0 |
| ViStoryBench: Comprehensive Benchmark Suite for Story Visualization | May 30, 2025 | Story Visualization | Code Available | 2 |
| SiLVR: A Simple Language-based Video Reasoning Framework | May 30, 2025 | Math, MME | Code Available | 1 |
| pyMEAL: A Multi-Encoder Augmentation-Aware Learning for Robust and Generalizable Medical Image Translation | May 30, 2025 | Computed Tomography (CT), SSIM | Code Available | 0 |
| TRAPDOC: Deceiving LLM Users by Injecting Imperceptible Phantom Tokens into Documents | May 30, 2025 | | Code Available | 0 |
| Learning reusable concepts across different egocentric video understanding tasks | May 30, 2025 | Video Understanding | Unverified | 0 |
| Who Gets the Kidney? Human-AI Alignment, Indecision, and Moral Values | May 30, 2025 | Decision Making | Unverified | 0 |
| Leveraging Intermediate Features of Vision Transformer for Face Anti-Spoofing | May 30, 2025 | Data Augmentation, Face Anti-Spoofing | Unverified | 0 |
| When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs | May 30, 2025 | Large Language Model | Unverified | 0 |
| Towards Secure MLOps: Surveying Attacks, Mitigation Strategies, and Research Challenges | May 30, 2025 | Red Teaming | Unverified | 0 |
| Sample-optimal learning of quantum states using gentle measurements | May 30, 2025 | LEMMA | Unverified | 0 |
| Tradeoffs between Mistakes and ERM Oracle Calls in Online and Transductive Online Learning | May 30, 2025 | 2k | Unverified | 0 |