| Generative AI Enables Medical Image Segmentation in Ultra Low-Data Regimes | Aug 30, 2024 | Deep Learning, Image Segmentation | Code Available | 2 |
| UTrack: Multi-Object Tracking with Uncertain Detections | Aug 30, 2024 | Autonomous Driving, Multi-Object Tracking | Code Available | 2 |
| Self-supervised Anomaly Detection Pretraining Enhances Long-tail ECG Diagnosis | Aug 30, 2024 | Anomaly Detection, Diagnostic | Code Available | 2 |
| MAPF-GPT: Imitation Learning for Multi-Agent Pathfinding at Scale | Aug 29, 2024 | Deep Reinforcement Learning, Imitation Learning | Code Available | 2 |
| Law of Vision Representation in MLLMs | Aug 29, 2024 | cross-modal alignment, Language Modeling | Code Available | 2 |
| Toward Robust Early Detection of Alzheimer's Disease via an Integrated Multimodal Learning Approach | Aug 29, 2024 | Diagnostic, EEG | Code Available | 2 |
| UV-free Texture Generation with Denoising and Geodesic Heat Diffusions | Aug 29, 2024 | Denoising, Texture Synthesis | Code Available | 2 |
| GRPose: Learning Graph Relations for Human Image Generation with Pose Priors | Aug 29, 2024 | Image Generation, Pose Estimation | Code Available | 2 |
| Spiking Diffusion Models | Aug 29, 2024 | Image Generation | Code Available | 2 |
| Interactive Agents: Simulating Counselor-Client Psychological Counseling via Role-Playing LLM-to-LLM Interactions | Aug 28, 2024 | Benchmarking | Code Available | 2 |
| chemtrain: Learning Deep Potential Models via Automatic Differentiation and Statistical Physics | Aug 28, 2024 | | Code Available | 2 |
| RoboSense: Large-scale Dataset and Benchmark for Egocentric Robot Perception and Navigation in Crowded and Unstructured Environments | Aug 28, 2024 | Autonomous Driving, Autonomous Navigation | Code Available | 2 |
| CSAD: Unsupervised Component Segmentation for Logical Anomaly Detection | Aug 28, 2024 | Anomaly Detection, Segmentation | Code Available | 2 |
| In-Context Imitation Learning via Next-Token Prediction | Aug 28, 2024 | Imitation Learning, Prediction | Code Available | 2 |
| Perceive-IR: Learning to Perceive Degradation Better for All-in-One Image Restoration | Aug 28, 2024 | All, Image Restoration | Code Available | 2 |
| Efficient LLM Scheduling by Learning to Rank | Aug 28, 2024 | Blocking, Chatbot | Code Available | 2 |
| Learning Harmonized Representations for Speculative Sampling | Aug 28, 2024 | | Code Available | 2 |
| Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation | Aug 28, 2024 | Object, Semantic Segmentation | Code Available | 2 |
| SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding | Aug 28, 2024 | Instruction Following, scientific discovery | Code Available | 2 |
| Towards Real-world Event-guided Low-light Video Enhancement and Deblurring | Aug 27, 2024 | Deblurring, Video Enhancement | Code Available | 2 |
| Drone-assisted Road Gaussian Splatting with Cross-view Uncertainty | Aug 27, 2024 | Autonomous Driving, Neural Rendering | Code Available | 2 |
| NeuroLM: A Universal Multi-task Foundation Model for Bridging the Gap between Language and EEG Signals | Aug 27, 2024 | EEG, Electroencephalogram (EEG) | Code Available | 2 |
| Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation | Aug 27, 2024 | Camouflaged Object Segmentation, Camouflaged Object Segmentation with a Single Task-generic Prompt | Code Available | 2 |
| LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet | Aug 27, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| Latent Ewald summation for machine learning of long-range interactions | Aug 27, 2024 | | Code Available | 2 |
| Writing in the Margins: Better Inference Pattern for Long Context Retrieval | Aug 27, 2024 | RAG, Retrieval | Code Available | 2 |
| RAW-Adapter: Adapting Pre-trained Visual Model to Camera RAW Images | Aug 27, 2024 | | Code Available | 2 |
| SAM & SAM 2 in 3D Slicer: SegmentWithSAM Extension for Annotating Medical Images | Aug 27, 2024 | | Code Available | 2 |
| Alfie: Democratising RGBA Image Generation With No $ | Aug 27, 2024 | Image Generation, Image Matting | Code Available | 2 |
| HPT++: Hierarchically Prompting Vision-Language Models with Multi-Granularity Knowledge Generation and Improved Structure Modeling | Aug 27, 2024 | Domain Generalization, Prompt Engineering | Code Available | 2 |
| A Practitioner's Guide to Continual Multimodal Pretraining | Aug 26, 2024 | Continual Learning, Continual Pretraining | Code Available | 2 |
| Video-CCAM: Enhancing Video-Language Understanding with Causal Cross-Attention Masks for Short and Long Videos | Aug 26, 2024 | Large Language Model, MVBench | Code Available | 2 |
| CURLoRA: Stable LLM Continual Fine-Tuning and Catastrophic Forgetting Mitigation | Aug 26, 2024 | Continual Learning | Code Available | 2 |
| Training-Free Activation Sparsity in Large Language Models | Aug 26, 2024 | Quantization | Code Available | 2 |
| MLR-Copilot: Autonomous Machine Learning Research based on Large Language Models Agents | Aug 26, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal-Conditioned Policy | Aug 26, 2024 | Few-Shot Learning, Image Generation | Code Available | 2 |
| MSVM-UNet: Multi-Scale Vision Mamba UNet for Medical Image Segmentation | Aug 25, 2024 | Image Segmentation, Mamba | Code Available | 2 |
| LLMs as Zero-shot Graph Learners: Alignment of GNN Representations with LLM Token Embeddings | Aug 25, 2024 | Language Modelling, Link Prediction | Code Available | 2 |
| SceneDreamer360: Text-Driven 3D-Consistent Scene Generation with Panoramic Gaussian Splatting | Aug 25, 2024 | 3DGS, Image Generation | Code Available | 2 |
| MobileQuant: Mobile-friendly Quantization for On-device Language Models | Aug 25, 2024 | Quantization | Code Available | 2 |
| TripleMixer: A 3D Point Cloud Denoising Model for Adverse Weather | Aug 25, 2024 | Autonomous Driving, Denoising | Code Available | 2 |
| 3D-RCNet: Learning from Transformer to Build a 3D Relational ConvNet for Hyperspectral Image Classification | Aug 25, 2024 | Computational Efficiency, Hyperspectral Image Classification | Code Available | 2 |
| DualAnoDiff: Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation | Aug 24, 2024 | Anomaly Classification, Anomaly Detection | Code Available | 2 |
| Segment Any Mesh: Zero-shot Mesh Part Segmentation via Lifting Segment Anything 2 to 3D | Aug 24, 2024 | Diversity, Segmentation | Code Available | 2 |
| SpeechCraft: A Fine-grained Expressive Speech Dataset with Natural Language Description | Aug 24, 2024 | Descriptive, Speech Synthesis | Code Available | 2 |
| Data-Driven Parametrization of Molecular Mechanics Force Fields for Expansive Chemical Space Coverage | Aug 23, 2024 | Computational Efficiency, Drug Discovery | Code Available | 2 |
| WildFusion: Individual Animal Identification with Calibrated Similarity Fusion | Aug 23, 2024 | | Code Available | 2 |
| CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities | Aug 23, 2024 | Denoising, Motion Generation | Code Available | 2 |
| LLM-PBE: Assessing Data Privacy in Large Language Models | Aug 23, 2024 | | Code Available | 2 |
| Image Segmentation in Foundation Model Era: A Survey | Aug 23, 2024 | Image Segmentation, Instance Segmentation | Code Available | 2 |