| Learning to Act from Actionless Videos through Dense Correspondences | Oct 12, 2023 | | Code Available | 2 |
| Effective Long-Context Scaling of Foundation Models | Sep 27, 2023 | Continual Pretraining, Language Modeling | Code Available | 2 |
| DehazeDCT: Towards Effective Non-Homogeneous Dehazing via Deformable Convolutional Transformer | Jun 12, 2024 | Image Dehazing, Nonhomogeneous Image Dehazing | Code Available | 2 |
| What Matters in Training a GPT4-Style Language Model with Multimodal Inputs? | Jul 5, 2023 | Instruction Following, Language Modeling | Code Available | 2 |
| Palette: Image-to-Image Diffusion Models | Nov 10, 2021 | Colorization, Denoising | Code Available | 2 |
| EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks | Jan 31, 2024 | Audio Generation, Speech Synthesis | Code Available | 2 |
| PaLM: Scaling Language Modeling with Pathways | Apr 5, 2022 | Auto Debugging, Code Generation | Code Available | 2 |
| RPN 2: On Interdependence Function Learning Towards Unifying and Advancing CNN, RNN, GNN, and Transformer | Nov 17, 2024 | | Code Available | 2 |
| TIPS: Text-Image Pretraining with Spatial Awareness | Oct 21, 2024 | Depth Estimation, Image Captioning | Code Available | 2 |
| Equivariance and partial observations in Koopman operator theory for partial differential equations | Jul 28, 2023 | | Code Available | 2 |
| Decouple and Track: Benchmarking and Improving Video Diffusion Transformers for Motion Transfer | Mar 21, 2025 | Benchmarking, Video Generation | Code Available | 2 |
| cadrille: Multi-modal CAD Reconstruction with Online Reinforcement Learning | May 28, 2025 | CAD Reconstruction, Large Language Model | Code Available | 2 |
| Fast protein backbone generation with SE(3) flow matching | Oct 8, 2023 | Protein Design | Code Available | 2 |
| DeepMol: An Automated Machine and Deep Learning Framework for Computational Chemistry | Jun 1, 2024 | Activity Prediction, AutoML | Code Available | 2 |
| Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment | Jun 18, 2024 | Denoising | Code Available | 2 |
| SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation | Oct 16, 2024 | Denoising, Video Generation | Code Available | 2 |
| Remasking Discrete Diffusion Models with Inference-Time Scaling | Mar 1, 2025 | | Code Available | 2 |
| SCoralDet: Efficient real-time underwater soft coral detection with YOLO | Dec 16, 2024 | 2D Object Detection, object-detection | Code Available | 2 |
| Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models | Oct 14, 2024 | | Code Available | 2 |
| GestureDiffuCLIP: Gesture Diffusion Model with CLIP Latents | Mar 26, 2023 | Contrastive Learning, Gesture Generation | Code Available | 2 |
| JourneyDB: A Benchmark for Generative Image Understanding | Jul 3, 2023 | Image Captioning, Image Comprehension | Code Available | 2 |
| X-maps: Direct Depth Lookup for Event-based Structured Light Systems | Feb 15, 2024 | Depth Estimation, Disparity Estimation | Code Available | 2 |
| CLIPZyme: Reaction-Conditioned Virtual Screening of Enzymes | Feb 9, 2024 | | Code Available | 2 |
| RegionDrag: Fast Region-Based Image Editing with Diffusion Models | Jul 25, 2024 | | Code Available | 2 |
| LiteVLoc: Map-Lite Visual Localization for Image Goal Navigation | Oct 6, 2024 | Pose Estimation, Visual Localization | Code Available | 2 |
| The 1st-place Solution for CVPR 2023 OpenLane Topology in Autonomous Driving Challenge | Jun 16, 2023 | Autonomous Driving | Code Available | 2 |
| SECOND: Sparsely Embedded Convolutional Detection | Oct 6, 2018 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| Adversarial Attacks and Defenses in Images, Graphs and Text: A Review | Sep 17, 2019 | Adversarial Attack | Code Available | 2 |
| Zip-NeRF: Anti-Aliased Grid-Based Neural Radiance Fields | Apr 13, 2023 | NeRF, Novel View Synthesis | Code Available | 2 |
| Differential Dynamic Programming for Multi-Phase Rigid Contact Dynamics | Apr 10, 2019 | | Code Available | 2 |
| Learning Implicit Surface Light Fields | Mar 27, 2020 | 3D Reconstruction, Image Generation | Code Available | 2 |
| SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection | Jul 1, 2024 | Object, object-detection | Code Available | 2 |
| Compressed Image Generation with Denoising Diffusion Codebook Models | Feb 3, 2025 | Conditional Image Generation, Denoising | Code Available | 2 |
| Neural Network Libraries: A Deep Learning Framework Designed from Engineers' Perspectives | Feb 12, 2021 | Deep Learning | Code Available | 2 |
| A large annotated medical image dataset for the development and evaluation of segmentation algorithms | Feb 25, 2019 | Benchmarking, Segmentation | Code Available | 2 |
| Any-resolution Training for High-resolution Image Synthesis | Apr 14, 2022 | 2k, Image Generation | Code Available | 2 |
| Neural 3D Reconstruction in the Wild | May 25, 2022 | 3D Reconstruction, Surface Reconstruction | Code Available | 2 |
| Differentiable Image Parameterizations | Jul 25, 2018 | Image Generation | Code Available | 2 |
| OpenScene: 3D Scene Understanding with Open Vocabularies | Nov 28, 2022 | 3D Open-Vocabulary Instance Segmentation, 3D Semantic Segmentation | Code Available | 2 |
| Diverse Preference Optimization | Jan 30, 2025 | Diversity | Code Available | 2 |
| Learning Practically Feasible Policies for Online 3D Bin Packing | Aug 31, 2021 | 3D Bin Packing, Collision Avoidance | Code Available | 2 |
| Beyond Fixed Topologies: Unregistered Training and Comprehensive Evaluation Metrics for 3D Talking Heads | Oct 14, 2024 | Talking Head Generation | Code Available | 2 |
| Mixture of A Million Experts | Jul 4, 2024 | Computational Efficiency, Language Modeling | Code Available | 2 |
| MoreFixes: A Large-Scale Dataset of CVE Fix Commits Mined through Enhanced Repository Discovery | Jul 10, 2024 | Vulnerability Detection | Code Available | 2 |
| DexGraspNet: A Large-Scale Robotic Dexterous Grasp Dataset for General Objects Based on Simulation | Oct 6, 2022 | Object | Code Available | 2 |
| BPR: Bayesian Personalized Ranking from Implicit Feedback | May 9, 2012 | | Code Available | 2 |
| 4DRadarSLAM: A 4D Imaging Radar SLAM System for Large-scale Environments based on Pose Graph Optimization | May 29, 2023 | | Code Available | 2 |
| DeSTA2: Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data | Sep 30, 2024 | Instruction Following, Language Modeling | Code Available | 2 |
| Classifier-Free Diffusion Guidance | Jul 26, 2022 | Diversity | Code Available | 2 |
| AICircuit: A Multi-Level Dataset and Benchmark for AI-Driven Analog Integrated Circuit Design | Jul 22, 2024 | | Code Available | 2 |