| RL-LLM-DT: An Automatic Decision Tree Generation Method Based on RL Evaluation and LLM Enhancement | Dec 16, 2024 | Reinforcement Learning (RL) | Code Available | 1 |
| Data-driven Precipitation Nowcasting Using Satellite Imagery | Dec 16, 2024 | Precipitation Forecasting | Code Available | 1 |
| AMI-Net: Adaptive Mask Inpainting Network for Industrial Anomaly Detection and Localization | Dec 16, 2024 | Anomaly Detection | Code Available | 1 |
| Exploring Semantic Consistency and Style Diversity for Domain Generalized Semantic Segmentation | Dec 16, 2024 | Diversity, Semantic Segmentation | Code Available | 1 |
| IDProtector: An Adversarial Noise Encoder to Protect Against ID-Preserving Image Generation | Dec 16, 2024 | Image Generation | Code Available | 1 |
| Relieving Universal Label Noise for Unsupervised Visible-Infrared Person Re-Identification by Inferring from Neighbors | Dec 16, 2024 | Person Re-Identification | Code Available | 1 |
| GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training | Dec 16, 2024 | Geometry Problem Solving | Code Available | 1 |
| Learning Normal Flow Directly From Event Neighborhoods | Dec 15, 2024 | Data Augmentation, Optical Flow Estimation | Code Available | 1 |
| OTLRM: Orthogonal Learning-based Low-Rank Metric for Multi-Dimensional Inverse Problems | Dec 15, 2024 | Denoising, Image Denoising | Code Available | 1 |
| Are Expressive Models Truly Necessary for Offline RL? | Dec 15, 2024 | D4RL, Offline RL | Code Available | 1 |
| NITRO: LLM Inference on Intel Laptop NPUs | Dec 15, 2024 | CPU, GPU | Code Available | 1 |
| Volumetric Mapping with Panoptic Refinement via Kernel Density Estimation for Mobile Robots | Dec 15, 2024 | 3D Reconstruction, 3D Scene Reconstruction | Code Available | 1 |
| Entropy-Regularized Process Reward Model | Dec 15, 2024 | GSM8K, Math | Code Available | 1 |
| GraphMoRE: Mitigating Topological Heterogeneity via Mixture of Riemannian Experts | Dec 15, 2024 | | Code Available | 1 |
| Unpaired Multi-Domain Histopathology Virtual Staining using Dual Path Prompted Inversion | Dec 15, 2024 | Diagnostic, Disentanglement | Code Available | 1 |
| HC-LLM: Historical-Constrained Large Language Models for Radiology Report Generation | Dec 15, 2024 | Diagnostic, In-Context Learning | Code Available | 1 |
| Why and How: Knowledge-Guided Learning for Cross-Spectral Image Patch Matching | Dec 15, 2024 | Metric Learning, Patch Matching | Code Available | 1 |
| ViPOcc: Leveraging Visual Priors from Vision Foundation Models for Single-View 3D Occupancy Prediction | Dec 15, 2024 | Autonomous Driving, Depth Estimation | Code Available | 1 |
| Smaller Language Models Are Better Instruction Evolvers | Dec 15, 2024 | | Code Available | 1 |
| Light-T2M: A Lightweight and Fast Model for Text-to-motion Generation | Dec 15, 2024 | GPU, Mamba | Code Available | 1 |
| Semi-Implicit Neural Ordinary Differential Equations | Dec 15, 2024 | Graph Classification, Graph Learning | Code Available | 1 |
| MoRe: Class Patch Attention Needs Regularization for Weakly Supervised Semantic Segmentation | Dec 15, 2024 | Semantic Segmentation, Weakly supervised Semantic Segmentation | Code Available | 1 |
| Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning | Dec 15, 2024 | Decision Making, Large Language Model | Code Available | 1 |
| AD-LLM: Benchmarking Large Language Models for Anomaly Detection | Dec 15, 2024 | Anomaly Detection, Benchmarking | Code Available | 1 |
| Explainable Fuzzy Neural Network with Multi-Fidelity Reinforcement Learning for Micro-Architecture Design Space Exploration | Dec 14, 2024 | Bayesian Optimization, Decision Making | Code Available | 1 |
| RapidNet: Multi-Level Dilated Convolution Based Mobile Backbone | Dec 14, 2024 | image-classification, Image Classification | Code Available | 1 |
| SusGen-GPT: A Data-Centric LLM for Financial NLP and Sustainability Report Generation | Dec 14, 2024 | RAG, Retrieval-augmented Generation | Code Available | 1 |
| Enhance Vision-Language Alignment with Noise | Dec 14, 2024 | Variational Inference | Code Available | 1 |
| Attention-driven GUI Grounding: Leveraging Pretrained Multimodal Large Language Models without Fine-Tuning | Dec 14, 2024 | TAG | Code Available | 1 |
| Video Diffusion Transformers are In-Context Learners | Dec 14, 2024 | Video Generation | Code Available | 1 |
| Rethinking Chain-of-Thought from the Perspective of Self-Training | Dec 14, 2024 | Computational Efficiency | Code Available | 1 |
| DCSEG: Decoupled 3D Open-Set Segmentation using Gaussian Splatting | Dec 14, 2024 | 3D Reconstruction, Segmentation | Code Available | 1 |
| Grid: Omni Visual Generation | Dec 14, 2024 | Image Generation, Scheduling | Code Available | 1 |
| ST-FiT: Inductive Spatial-Temporal Forecasting with Limited Training Data | Dec 14, 2024 | Data Augmentation | Code Available | 1 |
| Heterogeneous Graph Transformer for Multiple Tiny Object Tracking in RGB-T Videos | Dec 14, 2024 | Multi-Object Tracking, Multiple Object Tracking | Code Available | 1 |
| LAN: Learning to Adapt Noise for Image Denoising | Dec 14, 2024 | Denoising, Image Denoising | Code Available | 1 |
| DSRC: Learning Density-insensitive and Semantic-aware Collaborative Representation against Corruptions | Dec 14, 2024 | 3D Object Detection, object-detection | Code Available | 1 |
| CaLoRAify: Calorie Estimation with Visual-Text Pairing and LoRA-Driven Visual Language Models | Dec 13, 2024 | RAG | Code Available | 1 |
| The Complexity Dynamics of Grokking | Dec 13, 2024 | Generalization Bounds, Memorization | Code Available | 1 |
| Enhancing Multimodal Large Language Models Complex Reason via Similarity Computation | Dec 13, 2024 | Token Reduction | Code Available | 1 |
| waveOrder: generalist framework for label-agnostic computational microscopy | Dec 13, 2024 | | Code Available | 1 |
| A2RNet: Adversarial Attack Resilient Network for Robust Infrared and Visible Image Fusion | Dec 13, 2024 | Adversarial Attack, Infrared And Visible Image Fusion | Code Available | 1 |
| Building a Multi-modal Spatiotemporal Expert for Zero-shot Action Recognition with CLIP | Dec 13, 2024 | Action Recognition, Text Augmentation | Code Available | 1 |
| Investigating generalization capabilities of neural networks by means of loss landscapes and Hessian analysis | Dec 13, 2024 | | Code Available | 1 |
| WiseAD: Knowledge Augmented End-to-End Autonomous Driving with Vision-Language Model | Dec 13, 2024 | Autonomous Driving, Decision Making | Code Available | 1 |
| Prompt2Perturb (P2P): Text-Guided Diffusion-Based Adversarial Attacks on Breast Ultrasound Images | Dec 13, 2024 | Prompt Learning | Code Available | 1 |
| TSGaussian: Semantic and Depth-Guided Target-Specific Gaussian Splatting from Sparse Views | Dec 13, 2024 | Interactive Segmentation, Novel View Synthesis | Code Available | 1 |
| Real-time Identity Defenses against Malicious Personalization of Diffusion Models | Dec 13, 2024 | CPU, GPU | Code Available | 1 |
| EVOS: Efficient Implicit Neural Training via EVOlutionary Selector | Dec 13, 2024 | Evolutionary Algorithms, Selection bias | Code Available | 1 |
| CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information | Dec 13, 2024 | EEG, Electroencephalogram (EEG) | Code Available | 1 |