| Ant Colony Sampling with GFlowNets for Combinatorial Optimization | Mar 11, 2024 | Combinatorial Optimization | Code Available | 2 |
| LISO: Lidar-only Self-Supervised 3D Object Detection | Mar 11, 2024 | 3D Object Detection, Object | Code Available | 2 |
| Unsupervised Real-Time Hallucination Detection based on the Internal States of Large Language Models | Mar 11, 2024 | Hallucination | Code Available | 2 |
| Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews | Mar 11, 2024 | Language Modelling, Large Language Model | Code Available | 2 |
| EarthLoc: Astronaut Photography Localization by Indexing Earth from Space | Mar 11, 2024 | Data Augmentation, Disaster Response | Code Available | 2 |
| Eliminating Warping Shakes for Unsupervised Online Video Stitching | Mar 11, 2024 | Image Stitching, Video Stabilization | Code Available | 2 |
| Can LLMs Separate Instructions From Data? And What Do We Even Mean By That? | Mar 11, 2024 | Prompt Engineering | Code Available | 2 |
| ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis | Mar 11, 2024 | Question Answering | Code Available | 2 |
| Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System | Mar 11, 2024 | GPU, Language Modeling | Code Available | 2 |
| Probabilistic Contrastive Learning for Long-Tailed Visual Recognition | Mar 11, 2024 | Long-tail Learning | Code Available | 2 |
| MambaMIL: Enhancing Long Sequence Modeling with Sequence Reordering in Computational Pathology | Mar 11, 2024 | Mamba, Multiple Instance Learning | Code Available | 2 |
| RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback | Mar 11, 2024 | RAG, Retrieval | Code Available | 2 |
| CT2Rep: Automated Radiology Report Generation for 3D Medical Imaging | Mar 11, 2024 | | Code Available | 2 |
| The pitfalls of next-token prediction | Mar 11, 2024 | Mamba, Misconceptions | Code Available | 2 |
| Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement | Mar 11, 2024 | Clinical Knowledge, Descriptive | Code Available | 2 |
| DrFuse: Learning Disentangled Representation for Clinical Multi-Modal Fusion with Missing Modality and Modal Inconsistency | Mar 10, 2024 | Prediction, Prognosis | Code Available | 2 |
| Poly Kernel Inception Network for Remote Sensing Detection | Mar 10, 2024 | Object, object-detection | Code Available | 2 |
| VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models | Mar 10, 2024 | Copy Detection, Image Generation | Code Available | 2 |
| V_kD: Improving Knowledge Distillation using Orthogonal Projections | Mar 10, 2024 | Image Generation, Knowledge Distillation | Code Available | 2 |
| RepoHyper: Search-Expand-Refine on Semantic Graphs for Repository-Level Code Completion | Mar 10, 2024 | Code Completion, Link Prediction | Code Available | 2 |
| SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection | Mar 9, 2024 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Lightning NeRF: Efficient Hybrid Scene Representation for Autonomous Driving | Mar 9, 2024 | Autonomous Driving, NeRF | Code Available | 2 |
| KG-Rank: Enhancing Large Language Models for Medical QA with Knowledge Graphs and Ranking Techniques | Mar 9, 2024 | Knowledge Graphs, Long Form Question Answering | Code Available | 2 |
| Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline | Mar 9, 2024 | Object Tracking, Rgb-T Tracking | Code Available | 2 |
| S^2IP-LLM: Semantic Space Informed Prompt Learning with LLM for Time Series Forecasting | Mar 9, 2024 | Prompt Learning, Time Series | Code Available | 2 |
| MG-TSD: Multi-Granularity Time Series Diffusion Models with Guided Learning Process | Mar 9, 2024 | Probabilistic Time Series Forecasting, Time Series | Code Available | 2 |
| A self-supervised CNN for image watermark removal | Mar 9, 2024 | | Code Available | 2 |
| RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction | Mar 8, 2024 | Audio Generation, Computational Efficiency | Code Available | 2 |
| Audio-Synchronized Visual Animation | Mar 8, 2024 | | Code Available | 2 |
| FedFMS: Exploring Federated Foundation Models for Medical Image Segmentation | Mar 8, 2024 | Federated Learning, Image Segmentation | Code Available | 2 |
| DualBEV: Unifying Dual View Transformation with Probabilistic Correspondences | Mar 8, 2024 | | Code Available | 2 |
| Advanced Millimeter-Wave Radar System for Real-Time Multiple-Human Tracking and Fall Detection | Mar 8, 2024 | Clustering | Code Available | 2 |
| Frequency-Adaptive Dilated Convolution for Semantic Segmentation | Mar 8, 2024 | object-detection, Object Detection | Code Available | 2 |
| Arbitrary-Scale Point Cloud Upsampling by Voxel-Based Network with Latent Geometric-Consistent Learning | Mar 8, 2024 | point cloud upsampling | Code Available | 2 |
| GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM | Mar 8, 2024 | Quantization | Code Available | 2 |
| Debiasing Multimodal Large Language Models | Mar 8, 2024 | Fairness, Question Answering | Code Available | 2 |
| StereoDiffusion: Training-Free Stereo Image Generation Using Latent Diffusion Models | Mar 8, 2024 | Image Generation | Code Available | 2 |
| VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models | Mar 8, 2024 | Video Generation | Code Available | 2 |
| IsolateGPT: An Execution Isolation Architecture for LLM-Based Agentic Systems | Mar 8, 2024 | | Code Available | 2 |
| Beyond MOT: Semantic Multi-Object Tracking | Mar 8, 2024 | Multi-Object Tracking, Object | Code Available | 2 |
| Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval | Mar 8, 2024 | Image-text Retrieval, Retrieval | Code Available | 2 |
| Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance | Mar 8, 2024 | GPU, parameter-efficient fine-tuning | Code Available | 2 |
| XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution | Mar 8, 2024 | Image Super-Resolution, Super-Resolution | Code Available | 2 |
| Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery | Mar 8, 2024 | Multi-Label Classification, MUlTI-LABEL-ClASSIFICATION | Code Available | 2 |
| Face2Diffusion for Fast and Editable Face Personalization | Mar 8, 2024 | Diffusion Personalization, Diversity | Code Available | 2 |
| HistGen: Histopathology Report Generation via Local-Global Feature Encoding and Cross-modal Context Interaction | Mar 8, 2024 | Diagnostic, Medical Report Generation | Code Available | 2 |
| BjTT: A Large-scale Multimodal Dataset for Traffic Prediction | Mar 8, 2024 | Prediction, Traffic Prediction | Code Available | 2 |
| QAQ: Quality Adaptive Quantization for LLM KV Cache | Mar 7, 2024 | Quantization, Question Answering | Code Available | 2 |
| BAGS: Blur Agnostic Gaussian Splatting through Multi-Scale Kernel Modeling | Mar 7, 2024 | Novel View Synthesis | Code Available | 2 |
| LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error | Mar 7, 2024 | Continual Learning, In-Context Learning | Code Available | 2 |