| DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation | Nov 29, 2024 | Dataset Distillation, Diversity | Code Available | 1 |
| GREAT: Geometry-Intention Collaborative Inference for Open-Vocabulary 3D Object Affordance Grounding | Nov 29, 2024 | Collaborative Inference, Object | Code Available | 1 |
| ContextGNN: Beyond Two-Tower Recommendation Systems | Nov 29, 2024 | Link Prediction, Recommendation Systems | Code Available | 1 |
| On the Performance Analysis of Momentum Method: A Frequency Domain Perspective | Nov 29, 2024 | Image Classification | Code Available | 1 |
| Deepfake Media Generation and Detection in the Generative AI Era: A Survey and Outlook | Nov 29, 2024 | DeepFake Detection, Face Swapping | Code Available | 1 |
| Multiview Equivariance Improves 3D Correspondence Understanding with Minimal Feature Finetuning | Nov 29, 2024 | Pose Estimation | Code Available | 1 |
| ForgerySleuth: Empowering Multimodal Large Language Models for Image Manipulation Detection | Nov 29, 2024 | Image Manipulation, Image Manipulation Detection | Code Available | 1 |
| Another look at inference after prediction | Nov 29, 2024 | Prediction | Code Available | 1 |
| ChineseWebText 2.0: Large-Scale High-quality Chinese Web Text with Multi-dimensional and fine-grained information | Nov 29, 2024 | | Code Available | 1 |
| Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-Oasis | Nov 29, 2024 | Benchmarking, Claim Verification | Code Available | 1 |
| Diffusion Model Guided Sampling with Pixel-Wise Aleatoric Uncertainty Estimation | Nov 29, 2024 | Denoising | Code Available | 1 |
| MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation | Nov 28, 2024 | Data Augmentation, Image Segmentation | Code Available | 1 |
| Personalized Federated Fine-Tuning for LLMs via Data-Driven Heterogeneous Model Architectures | Nov 28, 2024 | Federated Learning | Code Available | 1 |
| Bayesian Deconvolution of Astronomical Images with Diffusion Models: Quantifying Prior-Driven Features in Reconstructions | Nov 28, 2024 | | Code Available | 1 |
| AMO Sampler: Enhancing Text Rendering with Overshooting | Nov 28, 2024 | Image Generation, Text to Image Generation | Code Available | 1 |
| I Dream My Painting: Connecting MLLMs and Diffusion Models via Prompt Generation for Text-Guided Multi-Mask Inpainting | Nov 28, 2024 | | Code Available | 1 |
| CLIP meets DINO for Tuning Zero-Shot Classifier using Unlabeled Image Collections | Nov 28, 2024 | image-classification, Image Classification | Code Available | 1 |
| Act Now: A Novel Online Forecasting Framework for Large-Scale Streaming Data | Nov 28, 2024 | GPU | Code Available | 1 |
| TAMT: Temporal-Aware Model Tuning for Cross-Domain Few-Shot Action Recognition | Nov 28, 2024 | Action Recognition, Cross-Domain Few-Shot | Code Available | 1 |
| Gaussians-to-Life: Text-Driven Animation of 3D Gaussian Splatting Scenes | Nov 28, 2024 | Novel View Synthesis | Code Available | 1 |
| Scaling Particle Collision Data Analysis | Nov 28, 2024 | | Code Available | 1 |
| FonTS: Text Rendering with Typography and Style Controls | Nov 28, 2024 | parameter-efficient fine-tuning | Code Available | 1 |
| Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model | Nov 28, 2024 | Denoising, Video Generation | Code Available | 1 |
| Global Tensor Motion Planning | Nov 28, 2024 | Dataset Generation, Diversity | Code Available | 1 |
| COMPrompter: reconceptualized segment anything model with multiprompt network for camouflaged object detection | Nov 28, 2024 | object-detection, Object Detection | Code Available | 1 |
| Libra: Leveraging Temporal Images for Biomedical Radiology Analysis | Nov 28, 2024 | | Code Available | 1 |
| A potassium ion channel simulated with a universal neural network potential | Nov 28, 2024 | | Code Available | 1 |
| The more, the better? Evaluating the role of EEG preprocessing for deep learning applications | Nov 27, 2024 | Deep Learning, EEG | Code Available | 1 |
| Helvipad: A Real-World Dataset for Omnidirectional Stereo Depth Estimation | Nov 27, 2024 | Depth Completion, Depth Estimation | Code Available | 1 |
| Adaptive Blind All-in-One Image Restoration | Nov 27, 2024 | 5-Degradation Blind All-in-One Image Restoration, All | Code Available | 1 |
| Neural Surface Priors for Editable Gaussian Splatting | Nov 27, 2024 | 3D geometry, 3D scene Editing | Code Available | 1 |
| Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment | Nov 27, 2024 | Safety Alignment, Visual Reasoning | Code Available | 1 |
| SpotLight: Shadow-Guided Object Relighting via Diffusion | Nov 27, 2024 | Image Relighting, Neural Rendering | Code Available | 1 |
| RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data | Nov 27, 2024 | Activity Recognition, Contrastive Learning | Code Available | 1 |
| Causal and Local Correlations Based Network for Multivariate Time Series Classification | Nov 27, 2024 | Graph Neural Network, Time Series | Code Available | 1 |
| Deep Fourier-embedded Network for Bi-modal Salient Object Detection | Nov 27, 2024 | object-detection, Object Detection | Code Available | 1 |
| Vision Mamba Distillation for Low-resolution Fine-grained Image Classification | Nov 27, 2024 | Classification, Fine-Grained Image Classification | Code Available | 1 |
| MM-Path: Multi-modal, Multi-granularity Path Representation Learning -- Extended Version | Nov 27, 2024 | Representation Learning | Code Available | 1 |
| Streamlining Prediction in Bayesian Deep Learning | Nov 27, 2024 | Deep Learning, Prediction | Code Available | 1 |
| Training and Evaluating Language Models with Template-based Data Generation | Nov 27, 2024 | Data Augmentation, Math | Code Available | 1 |
| SimCMF: A Simple Cross-modal Fine-tuning Strategy from Vision Foundation Models to Any Imaging Modality | Nov 27, 2024 | cross-modal alignment | Code Available | 1 |
| From Open Vocabulary to Open World: Teaching Vision Language Models to Detect Novel Objects | Nov 27, 2024 | Autonomous Driving, Object | Code Available | 1 |
| MatchDiffusion: Training-free Generation of Match-cuts | Nov 27, 2024 | Denoising | Code Available | 1 |
| Scalable Multi-Objective Reinforcement Learning with Fairness Guarantees using Lorenz Dominance | Nov 27, 2024 | Fairness, Multi-Objective Reinforcement Learning | Code Available | 1 |
| Cross-modal Information Flow in Multimodal Large Language Models | Nov 27, 2024 | Question Answering, Visual Question Answering | Code Available | 1 |
| Enhancing MMDiT-Based Text-to-Image Models for Similar Subject Generation | Nov 27, 2024 | Denoising | Code Available | 1 |
| Spectral-Spatial Transformer with Active Transfer Learning for Hyperspectral Image Classification | Nov 27, 2024 | Active Learning, Classification Of Hyperspectral Images | Code Available | 1 |
| Draft Model Knows When to Stop: A Self-Verification Length Policy for Speculative Decoding | Nov 27, 2024 | 8k | Code Available | 1 |
| VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format | Nov 27, 2024 | Dense Video Captioning, Grounded Video Question Answering | Code Available | 1 |
| HUPE: Heuristic Underwater Perceptual Enhancement with Semantic Collaborative Learning | Nov 27, 2024 | Image Enhancement | Code Available | 1 |