| GraphAD: Interaction Scene Graph for End-to-end Autonomous Driving | Mar 28, 2024 | Autonomous Driving | Code Available | 2 |
| Instance-Adaptive and Geometric-Aware Keypoint Learning for Category-Level 6D Object Pose Estimation | Mar 28, 2024 | 6D Pose Estimation using RGB, Keypoint Detection | Code Available | 2 |
| Top Leaderboard Ranking = Top Coding Proficiency, Always? EvoEval: Evolving Coding Benchmarks via LLM | Mar 28, 2024 | Code Generation, HumanEval | Code Available | 2 |
| Disentangling Length from Quality in Direct Preference Optimization | Mar 28, 2024 | reinforcement-learning, Reinforcement Learning | Code Available | 2 |
| Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving | Mar 28, 2024 | Autonomous Driving, Language Modeling | Code Available | 2 |
| Infrared Small Target Detection with Scale and Location Sensitivity | Mar 28, 2024 | Sensitivity | Code Available | 2 |
| DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs | Mar 28, 2024 | Fine-Grained Image Classification, Image Classification | Code Available | 2 |
| RecDiffusion: Rectangling for Image Stitching with Diffusion Models | Mar 28, 2024 | Image Stitching | Code Available | 2 |
| Total-Decom: Decomposed 3D Scene Reconstruction with Minimal Interaction | Mar 28, 2024 | 3D geometry, 3D Reconstruction | Code Available | 2 |
| MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation | Mar 28, 2024 | Talking Head Generation | Code Available | 2 |
| GlORIE-SLAM: Globally Optimized RGB-only Implicit Encoding Point Cloud SLAM | Mar 28, 2024 | Simultaneous Localization and Mapping | Code Available | 2 |
| Change-Agent: Towards Interactive Comprehensive Remote Sensing Change Interpretation and Analysis | Mar 28, 2024 | Change Detection, Language Modelling | Code Available | 2 |
| LITA: Language Instructed Temporal-Localization Assistant | Mar 27, 2024 | Instruction Following, Temporal Localization | Code Available | 2 |
| IDGenRec: LLM-RecSys Alignment with Textual ID Learning | Mar 27, 2024 | Sequential Recommendation, Text Generation | Code Available | 2 |
| Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding | Mar 27, 2024 | Attribute, Decision Making | Code Available | 2 |
| Generative Medical Segmentation | Mar 27, 2024 | Decoder, Domain Generalization | Code Available | 2 |
| DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure Alignment | Mar 27, 2024 | | Code Available | 2 |
| A Semi-supervised Nighttime Dehazing Baseline with Spatial-Frequency Aware and Realistic Brightness Constraint | Mar 27, 2024 | Image Dehazing, Pseudo Label | Code Available | 2 |
| Garment3DGen: 3D Garment Stylization and Texture Generation | Mar 27, 2024 | Image to 3D, Texture Synthesis | Code Available | 2 |
| Attention Calibration for Disentangled Text-to-Image Personalization | Mar 27, 2024 | Image Generation, Novel Concepts | Code Available | 2 |
| An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLM | Mar 27, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| A Diffusion-Based Generative Equalizer for Music Restoration | Mar 27, 2024 | Bandwidth Extension, Hallucination | Code Available | 2 |
| SingularTrajectory: Universal Trajectory Predictor Using Diffusion Model | Mar 27, 2024 | Denoising, Domain Adaptation | Code Available | 2 |
| Mind the Domain Gap: a Systematic Analysis on Bioacoustic Sound Event Detection | Mar 27, 2024 | Data Augmentation, Domain Adaptation | Code Available | 2 |
| Dual-path Mamba: Short and Long-term Bidirectional Selective Structured State Space Models for Speech Separation | Mar 27, 2024 | Mamba, Speech Separation | Code Available | 2 |