| 3D Interaction Geometric Pre-training for Molecular Relational Learning | Dec 4, 2024 | Contrastive Learning, Drug Discovery | Code Available | 1 |
| PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation | Dec 4, 2024 | Image Generation, Image Reconstruction | Code Available | 1 |
| MVCTrack: Boosting 3D Point Cloud Tracking via Multimodal-Guided Virtual Cues | Dec 3, 2024 | 3D Single Object Tracking, Autonomous Driving | Code Available | 1 |
| Towards Rich Emotions in 3D Avatars: A Text-to-3D Avatar Generation Benchmark | Dec 3, 2024 | Code Generation, Diversity | Code Available | 1 |
| CausalMob: Causal Human Mobility Prediction with LLMs-derived Human Intentions toward Public Events | Dec 3, 2024 | Articles, Causal Inference | Code Available | 1 |
| Fast LiDAR Data Generation with Rectified Flows | Dec 3, 2024 | | Code Available | 1 |
| What should a neuron aim for? Designing local objective functions based on information theory | Dec 3, 2024 | global-optimization | Code Available | 1 |
| Active Negative Loss: A Robust Framework for Learning with Noisy Labels | Dec 3, 2024 | Image Segmentation, Learning with noisy labels | Code Available | 1 |
| TAB-Fields: A Maximum Entropy Framework for Mission-Aware Adversarial Planning | Dec 3, 2024 | | Code Available | 1 |
| Implementing An Artificial Quantum Perceptron | Dec 3, 2024 | | Code Available | 1 |
| Drawing Pandas: A Benchmark for LLMs in Generating Plotting Code | Dec 3, 2024 | | Code Available | 1 |
| Trajectory-based Road Autolabeling with Lidar-Camera Fusion in Winter Conditions | Dec 3, 2024 | Autonomous Driving, Road Segmentation | Code Available | 1 |
| SimuScope: Realistic Endoscopic Synthetic Dataset Generation through Surgical Simulation and Diffusion Models | Dec 3, 2024 | Dataset Generation, Image-to-Image Translation | Code Available | 1 |
| A Markowitz Approach to Managing a Dynamic Basket of Moving-Band Statistical Arbitrages | Dec 3, 2024 | | Code Available | 1 |
| AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information? | Dec 3, 2024 | Multiple-choice | Code Available | 1 |
| AccDiffusion v2: Towards More Accurate Higher-Resolution Diffusion Extrapolation | Dec 3, 2024 | Image Generation, Local Distortion | Code Available | 1 |
| RelayGS: Reconstructing Dynamic Scenes with Large-Scale and Complex Motions via Relay Gaussians | Dec 3, 2024 | 3DGS | Code Available | 1 |
| F-SE-LSTM: A Time Series Anomaly Detection Method with Frequency Domain Information | Dec 3, 2024 | Anomaly Detection, Time Series | Code Available | 1 |
| Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs | Dec 3, 2024 | In-Context Learning, Meta-Learning | Code Available | 1 |
| EvRT-DETR: Latent Space Adaptation of Image Detectors for Event-based Vision | Dec 3, 2024 | Event-based vision, Event Detection | Code Available | 1 |
| How to Use Diffusion Priors under Sparse Views? | Dec 3, 2024 | 3D Reconstruction, Novel View Synthesis | Code Available | 1 |
| Agri-LLaVA: Knowledge-Infused Large Multimodal Assistant on Agricultural Pests and Diseases | Dec 3, 2024 | Instruction Following | Code Available | 1 |
| MERGE: Multi-faceted Hierarchical Graph-based GNN for Gene Expression Prediction from Whole Slide Histopathology Images | Dec 3, 2024 | graph construction, Prediction | Code Available | 1 |
| RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation | Dec 3, 2024 | Referring Expression, Referring Expression Segmentation | Code Available | 1 |
| RARE: Retrieval-Augmented Reasoning Enhancement for Large Language Models | Dec 3, 2024 | Information Retrieval, Retrieval | Code Available | 1 |
| ROVER: A Multi-Season Dataset for Visual SLAM | Dec 3, 2024 | Autonomous Navigation, Outdoor Localization | Code Available | 1 |
| VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding | Dec 3, 2024 | In-Context Learning, Video Understanding | Code Available | 1 |
| Gracefully Filtering Backdoor Samples for Generative Large Language Models without Retraining | Dec 3, 2024 | backdoor defense, Computational Efficiency | Code Available | 1 |
| TDD-Bench Verified: Can LLMs Generate Tests for Issues Before They Get Resolved? | Dec 3, 2024 | test driven development | Code Available | 1 |
| Class-wise Autoencoders Measure Classification Difficulty And Detect Label Mistakes | Dec 3, 2024 | Classification | Code Available | 1 |
| Cross-Attention Head Position Patterns Can Align with Human Visual Concepts in Text-to-Image Generative Models | Dec 3, 2024 | Image Generation, Position | Code Available | 1 |
| Dual-Branch Graph Transformer Network for 3D Human Mesh Reconstruction from Video | Dec 2, 2024 | | Code Available | 1 |
| COSMOS: Cross-Modality Self-Distillation for Vision Language Pre-training | Dec 2, 2024 | Self-Supervised Learning, Semantic Segmentation | Code Available | 1 |
| ShowHowTo: Generating Scene-Conditioned Step-by-Step Visual Instructions | Dec 2, 2024 | | Code Available | 1 |
| Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Model | Dec 2, 2024 | cross-modal alignment, Knowledge Distillation | Code Available | 1 |
| Understanding Bias in Large-Scale Visual Datasets | Dec 2, 2024 | | Code Available | 1 |
| Phaseformer: Phase-based Attention Mechanism for Underwater Image Restoration and Beyond | Dec 2, 2024 | Image Enhancement, Image Restoration | Code Available | 1 |
| The Landscape of Causal Discovery Data: Grounding Causal Discovery in Real-World Applications | Dec 2, 2024 | Causal Discovery | Code Available | 1 |
| MBA-RAG: a Bandit Approach for Adaptive Retrieval-Augmented Generation through Question Complexity | Dec 2, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| Down with the Hierarchy: The 'H' in HNSW Stands for "Hubs" | Dec 2, 2024 | Benchmarking, Representation Learning | Code Available | 1 |
| DiffPatch: Generating Customizable Adversarial Patches using Diffusion Model | Dec 2, 2024 | model | Code Available | 1 |
| Multi-Scale Representation Learning for Protein Fitness Prediction | Dec 2, 2024 | Prediction, Protein Language Model | Code Available | 1 |
| GraphOTTER: Evolving LLM-based Graph Reasoning for Complex Table Question Answering | Dec 2, 2024 | Question Answering | Code Available | 1 |
| MambaU-Lite: A Lightweight Model based on Mamba and Integrated Channel-Spatial Attention for Skin Lesion Segmentation | Dec 2, 2024 | Diagnostic, Lesion Segmentation | Code Available | 1 |
| NLLG Quarterly arXiv Report 09/24: What are the most influential current AI Papers? | Dec 2, 2024 | State Space Models | Code Available | 1 |
| IQA-Adapter: Exploring Knowledge Transfer from Image Quality Assessment to Diffusion-based Generative Models | Dec 2, 2024 | Adversarial Robustness, Conditional Image Generation | Code Available | 1 |
| Swin Transformer with Enhanced Dropout and Layer-wise Unfreezing for Facial Expression Recognition in Mental Health Detection | Dec 2, 2024 | Emotion Recognition, Facial Emotion Recognition | Code Available | 1 |
| BroadTrack: Broadcast Camera Tracking for Soccer | Dec 2, 2024 | Camera Calibration | Code Available | 1 |
| VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval | Dec 2, 2024 | Highlight Detection, Moment Retrieval | Code Available | 1 |
| Learning Structured Representations with Hyperbolic Embeddings | Dec 2, 2024 | Out of Distribution (OOD) Detection, Representation Learning | Code Available | 1 |