| EgoToM: Benchmarking Theory of Mind Reasoning from Egocentric Videos | Mar 28, 2025 | Benchmarking, Question Answering | Code Available | 1 |
| CoSIL: Software Issue Localization via LLM-Driven Code Repository Graph Searching | Mar 28, 2025 | Program Repair | Code Available | 1 |
| AdaRank: Adaptive Rank Pruning for Enhanced Model Merging | Mar 28, 2025 | Computational Efficiency, model | Code Available | 1 |
| Mitigating Trade-off: Stream and Query-guided Aggregation for Efficient and Effective 3D Occupancy Prediction | Mar 28, 2025 | Autonomous Driving, Scene Understanding | Code Available | 1 |
| Baseline Systems and Evaluation Metrics for Spatial Semantic Segmentation of Sound Scenes | Mar 28, 2025 | Audio Tagging, Semantic Segmentation | Code Available | 1 |
| Enhance Generation Quality of Flow Matching V2A Model via Multi-Step CoT-Like Guidance and Combined Preference Optimization | Mar 28, 2025 | Audio Generation, FAD | Code Available | 1 |
| DIFFER: Disentangling Identity Features via Semantic Cues for Clothes-Changing Person Re-ID | Mar 28, 2025 | Clothes Changing Person Re-Identification, Disentanglement | Code Available | 1 |
| FLIP: Towards Comprehensive and Reliable Evaluation of Federated Prompt Learning | Mar 28, 2025 | Federated Learning, Prompt Learning | Code Available | 1 |
| Recurrent Feature Mining and Keypoint Mixup Padding for Category-Agnostic Pose Estimation | Mar 27, 2025 | Category-Agnostic Pose Estimation, Pose Estimation | Code Available | 1 |
| Fine-Grained Evaluation of Large Vision-Language Models in Autonomous Driving | Mar 27, 2025 | Attribute, Autonomous Driving | Code Available | 1 |
| BOOTPLACE: Bootstrapped Object Placement with Detection Transformers | Mar 27, 2025 | Data Augmentation, Object | Code Available | 1 |
| ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation | Mar 27, 2025 | Question Answering, RAG | Code Available | 1 |
| FineCIR: Explicit Parsing of Fine-Grained Modification Semantics for Composed Image Retrieval | Mar 27, 2025 | Image Retrieval, Retrieval | Code Available | 1 |
| InternVL-X: Advancing and Accelerating InternVL Series with Efficient Visual Token Compression | Mar 27, 2025 | Computational Efficiency, Large Language Model | Code Available | 1 |
| uLayout: Unified Room Layout Estimation for Perspective and Panoramic Images | Mar 27, 2025 | Room Layout Estimation | Code Available | 1 |
| LOCORE: Image Re-ranking with Long-Context Sequence Modeling | Mar 27, 2025 | Image Retrieval, Re-Ranking | Code Available | 1 |
| Data-Agnostic Robotic Long-Horizon Manipulation with Vision-Language-Guided Closed-Loop Feedback | Mar 27, 2025 | Task Planning | Code Available | 1 |
| Learning Class Prototypes for Unified Sparse Supervised 3D Object Detection | Mar 27, 2025 | 3D Object Detection, Object | Code Available | 1 |
| Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression | Mar 27, 2025 | Image Compression | Code Available | 1 |
| The MVTec AD 2 Dataset: Advanced Scenarios for Unsupervised Anomaly Detection | Mar 27, 2025 | Anomaly Detection, Unsupervised Anomaly Detection | Code Available | 1 |
| Test-Time Visual In-Context Tuning | Mar 27, 2025 | In-Context Learning | Code Available | 1 |
| Omni-AD: Learning to Reconstruct Global and Local Features for Multi-class Anomaly Detection | Mar 27, 2025 | Anomaly Detection, Decoder | Code Available | 1 |
| BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding | Mar 27, 2025 | Form, Language Modeling | Code Available | 1 |
| R-PRM: Reasoning-Driven Process Reward Modeling | Mar 27, 2025 | Mathematical Reasoning | Code Available | 1 |
| Reinforced Model Merging | Mar 27, 2025 | model | Code Available | 1 |
| ZJUKLAB at SemEval-2025 Task 4: Unlearning via Model Merging | Mar 27, 2025 | | Code Available | 1 |
| The Procedural Content Generation Benchmark: An Open-source Testbed for Generative Challenges in Games | Mar 27, 2025 | Diversity | Code Available | 1 |
| Data Poisoning in Deep Learning: A Survey | Mar 27, 2025 | Data Poisoning, Deep Learning | Code Available | 1 |
| Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation | Mar 27, 2025 | Domain Adaptation, Open Vocabulary Semantic Segmentation | Code Available | 1 |
| A friendly introduction to triangular transport | Mar 27, 2025 | Bayesian Inference, Decision Making | Code Available | 1 |
| On Large Multimodal Models as Open-World Image Classifiers | Mar 27, 2025 | image-classification, Image Classification | Code Available | 1 |
| FaceBench: A Multi-View Multi-Level Facial Attribute VQA Dataset for Benchmarking Face Perception MLLMs | Mar 27, 2025 | Attribute, Benchmarking | Code Available | 1 |
| OpenHuEval: Evaluating Large Language Model on Hungarian Specifics | Mar 27, 2025 | Language Modeling, Language Modelling | Code Available | 1 |
| VADMamba: Exploring State Space Models for Fast Video Anomaly Detection | Mar 27, 2025 | Anomaly Detection, Computational Efficiency | Code Available | 1 |
| UGNA-VPR: A Novel Training Paradigm for Visual Place Recognition Based on Uncertainty-Guided NeRF Augmentation | Mar 27, 2025 | Autonomous Navigation, Data Augmentation | Code Available | 1 |
| Empowering Retrieval-based Conversational Recommendation with Contrasting User Preferences | Mar 27, 2025 | Conversational Recommendation, Recommendation Systems | Code Available | 1 |
| DSU-Net:An Improved U-Net Model Based on DINOv2 and SAM2 with Multi-scale Cross-model Feature Enhancement | Mar 27, 2025 | Image Segmentation, model | Code Available | 1 |
| ThinkEdit: Interpretable Weight Editing to Mitigate Overly Short Thinking in Reasoning Models | Mar 27, 2025 | Math | Code Available | 1 |
| A Comprehensive Benchmark for RNA 3D Structure-Function Modeling | Mar 27, 2025 | Benchmarking, Deep Learning | Code Available | 1 |
| LOCATEdit: Graph Laplacian Optimized Cross Attention for Localized Text-Guided Image Editing | Mar 27, 2025 | text-guided-image-editing | Code Available | 1 |
| Comprehensive segmentation of deep grey nuclei from structural MRI data | Mar 27, 2025 | Segmentation | Code Available | 1 |
| Learning from spatially inhomogenous data: resolution-adaptive convolutions for multiple sclerosis lesion segmentation | Mar 26, 2025 | Lesion Segmentation | Code Available | 1 |
| BioX-CPath: Biologically-driven Explainable Diagnostics for Multistain IHC Computational Pathology | Mar 26, 2025 | Explainable Models, Graph Neural Network | Code Available | 1 |
| Devil is in the Uniformity: Exploring Diverse Learners within Transformer for Image Restoration | Mar 26, 2025 | Denoising, Image Restoration | Code Available | 1 |
| Fast, Modular, and Differentiable Framework for Machine Learning-Enhanced Molecular Simulations | Mar 26, 2025 | | Code Available | 1 |
| SChanger: Change Detection from a Semantic Change and Spatial Consistency Perspective | Mar 26, 2025 | Change Detection | Code Available | 1 |
| 3MDBench: Medical Multimodal Multi-agent Dialogue Benchmark | Mar 26, 2025 | Diagnostic, Multimodal Reasoning | Code Available | 1 |
| EGVD: Event-Guided Video Diffusion Model for Physically Realistic Large-Motion Frame Interpolation | Mar 26, 2025 | Video Frame Interpolation | Code Available | 1 |
| sudo rm -rf agentic_security | Mar 26, 2025 | Adversarial Attack, AI and Safety | Code Available | 1 |
| Siformer: Feature-isolated Transformer for Efficient Skeleton-based Sign Language Recognition | Mar 26, 2025 | Action Recognition, Computational Efficiency | Code Available | 1 |