| MeshFleet: Filtered and Annotated 3D Vehicle Dataset for Domain Specific Generative Modeling | Mar 18, 2025 | | Code Available | 1 |
| Improving Adaptive Density Control for 3D Gaussian Splatting | Mar 18, 2025 | 3DGS, Novel View Synthesis | Code Available | 1 |
| Empowering Smaller Models: Tuning LLaMA and Gemma with Chain-of-Thought for Ukrainian Exam Tasks | Mar 18, 2025 | GPU, parameter-efficient fine-tuning | Code Available | 1 |
| FusDreamer: Label-efficient Remote Sensing World Model for Multimodal Data Classification | Mar 18, 2025 | Combinatorial Optimization, Contrastive Learning | Code Available | 1 |
| Anomaly-Flow: A Multi-domain Federated Generative Adversarial Network for Distributed Denial-of-Service Detection | Mar 18, 2025 | Federated Learning, Generative Adversarial Network | Code Available | 1 |
| SimWorld: A Unified Benchmark for Simulator-Conditioned Scene Generation via World Model | Mar 18, 2025 | Autonomous Driving, Image Generation | Code Available | 1 |
| MamBEV: Enabling State Space Models to Learn Birds-Eye-View Representations | Mar 18, 2025 | Autonomous Driving, Mamba | Code Available | 1 |
| DIFFVSGG: Diffusion-Driven Online Video Scene Graph Generation | Mar 18, 2025 | Denoising, GPU | Code Available | 1 |
| EEG-CLIP : Learning EEG representations from natural language descriptions | Mar 18, 2025 | Contrastive Learning, EEG | Code Available | 1 |
| JuDGE: Benchmarking Judgment Document Generation for Chinese Legal System | Mar 18, 2025 | Benchmarking, In-Context Learning | Code Available | 1 |
| Robust Object Detection of Underwater Robot based on Domain Generalization | Mar 18, 2025 | Domain Generalization, Object | Code Available | 1 |
| RAGO: Systematic Performance Optimization for Retrieval-Augmented Generation Serving | Mar 18, 2025 | RAG, Retrieval | Code Available | 1 |
| State Space Model Meets Transformer: A New Paradigm for 3D Object Detection | Mar 18, 2025 | 3D Object Detection, Decoder | Code Available | 1 |
| MP-GUI: Modality Perception with MLLMs for GUI Understanding | Mar 18, 2025 | | Code Available | 1 |
| Make Your Training Flexible: Towards Deployment-Efficient Video Models | Mar 18, 2025 | Action Classification, Zero-Shot Video Retrieval | Code Available | 1 |
| AIGVE-Tool: AI-Generated Video Evaluation Toolkit with Multifaceted Benchmark | Mar 18, 2025 | Video Generation | Code Available | 1 |
| DPImageBench: A Unified Benchmark for Differentially Private Image Synthesis | Mar 18, 2025 | Image Generation | Code Available | 1 |
| RoGSplat: Learning Robust Generalizable Human Gaussian Splatting from Sparse Multi-View Images | Mar 18, 2025 | Novel View Synthesis | Code Available | 1 |
| A Comprehensive Survey on Cross-Domain Recommendation: Taxonomy, Progress, and Prospects | Mar 18, 2025 | Articles, Model Optimization | Code Available | 1 |
| Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM | Mar 18, 2025 | | Code Available | 1 |
| Multi-Prototype Embedding Refinement for Semi-Supervised Medical Image Segmentation | Mar 18, 2025 | Image Segmentation, Medical Image Segmentation | Code Available | 1 |
| Inferring Event Descriptions from Time Series with Language Models | Mar 18, 2025 | Time Series | Code Available | 1 |
| Image Captioning Evaluation in the Age of Multimodal LLMs: Challenges and Future Perspectives | Mar 18, 2025 | Image Captioning | Code Available | 1 |
| UncTrack: Reliable Visual Object Tracking with Uncertainty-Aware Prototype Memory Network | Mar 17, 2025 | Object Tracking, Visual Object Tracking | Code Available | 1 |
| Scale Efficient Training for Large Datasets | Mar 17, 2025 | geo-localization, Image Retrieval | Code Available | 1 |
| Mind the Gap: Confidence Discrepancy Can Guide Federated Semi-Supervised Learning Across Pseudo-Mismatch | Mar 17, 2025 | Federated Learning, Pseudo Label | Code Available | 1 |
| Interpretable Unsupervised Joint Denoising and Enhancement for Real-World low-light Scenarios | Mar 17, 2025 | Denoising | Code Available | 1 |
| TriLiteNet: Lightweight Model for Multi-Task Visual Perception | Mar 17, 2025 | Autonomous Driving, Computational Efficiency | Code Available | 1 |
| Strain Problems got you in a Twist? Try StrainRelief: A Quantum-Accurate Tool for Ligand Strain Calculations | Mar 17, 2025 | Drug Design, Drug Discovery | Code Available | 1 |
| MSWAL: 3D Multi-class Segmentation of Whole Abdominal Lesions Dataset | Mar 17, 2025 | Transfer Learning | Code Available | 1 |
| DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models | Mar 17, 2025 | | Code Available | 1 |
| DLPO: Towards a Robust, Efficient, and Generalizable Prompt Optimization Framework from a Deep-Learning Perspective | Mar 17, 2025 | | Code Available | 1 |
| SatDepth: A Novel Dataset for Satellite Image Matching | Mar 17, 2025 | | Code Available | 1 |
| From Zero to Detail: Deconstructing Ultra-High-Definition Image Restoration from Progressive Spectral Perspective | Mar 17, 2025 | Image Restoration, Kolmogorov-Arnold Networks | Code Available | 1 |
| NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models | Mar 17, 2025 | Question Answering, Scene Understanding | Code Available | 1 |
| An interpretable approach to automating the assessment of biofouling in video footage | Mar 17, 2025 | | Code Available | 1 |
| Sampling Innovation-Based Adaptive Compressive Sensing | Mar 17, 2025 | Compressive Sensing, Image Reconstruction | Code Available | 1 |
| Omnia de EgoTempo: Benchmarking Temporal Understanding of Multi-Modal LLMs in Egocentric Videos | Mar 17, 2025 | Benchmarking, Question Answering | Code Available | 1 |
| A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules | Mar 17, 2025 | | Code Available | 1 |
| A General Adaptive Dual-level Weighting Mechanism for Remote Sensing Pansharpening | Mar 17, 2025 | Pansharpening | Code Available | 1 |
| Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation | Mar 17, 2025 | Mathematical Reasoning, Reinforcement Learning (RL) | Code Available | 1 |
| How Good is my Histopathology Vision-Language Foundation Model? A Holistic Benchmark | Mar 17, 2025 | | Code Available | 1 |
| Lifting the Veil on Visual Information Flow in MLLMs: Unlocking Pathways to Faster Inference | Mar 17, 2025 | | Code Available | 1 |
| Can Language Models Follow Multiple Turns of Entangled Instructions? | Mar 17, 2025 | Instruction Following, Memorization | Code Available | 1 |
| Prompt Flow Integrity to Prevent Privilege Escalation in LLM Agents | Mar 17, 2025 | | Code Available | 1 |
| MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research | Mar 17, 2025 | Articles, Benchmarking | Code Available | 1 |
| FNSE-SBGAN: Far-field Speech Enhancement with Schrodinger Bridge and Generative Adversarial Networks | Mar 17, 2025 | Speech Enhancement | Code Available | 1 |
| Grounded Chain-of-Thought for Multimodal Large Language Models | Mar 17, 2025 | Hallucination, Spatial Reasoning | Code Available | 1 |
| UCF-Crime-DVS: A Novel Event-Based Dataset for Video Anomaly Detection with Spiking Neural Networks | Mar 17, 2025 | Anomaly Detection, Optical Flow Estimation | Code Available | 1 |
| FedVSR: Towards Model-Agnostic Federated Learning in Video Super-Resolution | Mar 17, 2025 | Federated Learning, Federated Learning (Video Super-Resolution) | Code Available | 1 |