| Inst-Inpaint: Instructing to Remove Objects with Diffusion Models | Apr 6, 2023 | Image Inpainting | Code Available | 2 |
| DiffMimic: Efficient Motion Mimicking with Differentiable Physics | Apr 6, 2023 | reinforcement-learning, Reinforcement Learning (RL) | Code Available | 2 |
| Training-Free Layout Control with Cross-Attention Guidance | Apr 6, 2023 | | Code Available | 2 |
| Towards Interpretable Mental Health Analysis with Large Language Models | Apr 6, 2023 | Causal Emotion Entailment, Emotion Recognition | Code Available | 2 |
| ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments | Apr 6, 2023 | Autonomous Navigation, Navigate | Code Available | 2 |
| Synthesizing Anyone, Anywhere, in Any Pose | Apr 6, 2023 | | Code Available | 2 |
| Detecting and Grounding Multi-Modal Media Manipulation | Apr 5, 2023 | Binary Classification, Contrastive Learning | Code Available | 2 |
| Structured prompt interrogation and recursive extraction of semantics (SPIRES): A method for populating knowledge bases using zero-shot learning | Apr 5, 2023 | Relation Extraction, Zero-Shot Learning | Code Available | 2 |
| OrienterNet: Visual Localization in 2D Public Maps with Neural Matching | Apr 4, 2023 | Visual Localization | Code Available | 2 |
| Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing | Apr 4, 2023 | Multimodal fashion image editing | Code Available | 2 |
| GlueStick: Robust Image Matching by Sticking Points and Lines Together | Apr 4, 2023 | Graph Neural Network, Pose Estimation | Code Available | 2 |
| Joint 2D-3D Multi-Task Learning on Cityscapes-3D: 3D Detection, Segmentation, and Depth Estimation | Apr 3, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| On the Benefits of 3D Pose and Tracking for Human Action Recognition | Apr 3, 2023 | Action Recognition, Temporal Action Localization | Code Available | 2 |
| DoctorGLM: Fine-tuning your Chinese Doctor is not a Herculean Task | Apr 3, 2023 | | Code Available | 2 |
| ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model | Apr 3, 2023 | Denoising, Diversity | Code Available | 2 |
| MiniRBT: A Two-stage Distilled Small Chinese Pre-trained Model | Apr 3, 2023 | Machine Reading Comprehension, Reading Comprehension | Code Available | 2 |
| RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding | Apr 3, 2023 | Contrastive Learning, Instance Segmentation | Code Available | 2 |
| Counterfactual Learning on Graphs: A Survey | Apr 3, 2023 | counterfactual, Fairness | Code Available | 2 |
| Robust Multiview Point Cloud Registration with Reliable Pose Graph Initialization and History Reweighting | Apr 2, 2023 | Point Cloud Registration | Code Available | 2 |
| Leveraging medical Twitter to build a visual–language foundation model for pathology AI | Apr 1, 2023 | Transfer Learning | Code Available | 2 |
| Solving Dynamic Traveling Salesman Problems With Deep Reinforcement Learning | Apr 1, 2023 | Deep Reinforcement Learning, reinforcement-learning | Code Available | 2 |
| Self-Supervised Multimodal Learning: A Survey | Mar 31, 2023 | Machine Translation, Self-Supervised Learning | Code Available | 2 |
| EA-LSS: Edge-aware Lift-splat-shot Framework for 3D BEV Object Detection | Mar 31, 2023 | 3D Object Detection, Depth Estimation | Code Available | 2 |
| GlyphDraw: Seamlessly Rendering Text with Intricate Spatial Structures in Text-to-Image Generation | Mar 31, 2023 | Image Generation, Optical Character Recognition (OCR) | Code Available | 2 |
| PoseFormerV2: Exploring Frequency Domain for Efficient and Robust 3D Human Pose Estimation | Mar 30, 2023 | 3D Human Pose Estimation, Classification | Code Available | 2 |
| WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research | Mar 30, 2023 | Audio captioning, Event Detection | Code Available | 2 |
| LayoutDiffusion: Controllable Diffusion Model for Layout-to-image Generation | Mar 30, 2023 | Image Generation, Layout-to-Image Generation | Code Available | 2 |
| PAIR-Diffusion: A Comprehensive Multimodal Object-Level Image Editor | Mar 30, 2023 | Object | Code Available | 2 |
| SynBody: Synthetic Dataset with Layered Human Models for 3D Human Perception and Modeling | Mar 30, 2023 | Diversity, Human Mesh Recovery | Code Available | 2 |
| Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models | Mar 30, 2023 | Video Alignment, Video Editing | Code Available | 2 |
| Hierarchical Fine-Grained Image Forgery Detection and Localization | Mar 30, 2023 | Attribute, Classification | Code Available | 2 |
| DDP: Diffusion Model for Dense Visual Prediction | Mar 30, 2023 | Denoising, Depth Estimation | Code Available | 2 |
| NeRF-Supervised Deep Stereo | Mar 30, 2023 | NeRF, Neural Rendering | Code Available | 2 |
| 3D Line Mapping Revisited | Mar 30, 2023 | Visual Localization | Code Available | 2 |
| Language Models can Solve Computer Tasks | Mar 30, 2023 | Language Modelling, Large Language Model | Code Available | 2 |
| Pgx: Hardware-Accelerated Parallel Game Simulators for Reinforcement Learning | Mar 29, 2023 | GPU, reinforcement-learning | Code Available | 2 |
| VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking | Mar 29, 2023 | Action Classification, Action Recognition | Code Available | 2 |
| Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert | Mar 29, 2023 | Contrastive Learning, Face Generation | Code Available | 2 |
| Implicit Diffusion Models for Continuous Super-Resolution | Mar 29, 2023 | Denoising, Image Super-Resolution | Code Available | 2 |
| Ten Quick Tips for Harnessing the Power of ChatGPT/GPT-4 in Computational Biology | Mar 29, 2023 | Chatbot, Prompt Engineering | Code Available | 2 |
| HyperDiffusion: Generating Implicit Neural Fields with Weight-Space Diffusion | Mar 29, 2023 | | Code Available | 2 |
| SparseNeRF: Distilling Depth Ranking for Few-shot Novel View Synthesis | Mar 28, 2023 | NeRF, Novel View Synthesis | Code Available | 2 |
| Mask-Free Video Instance Segmentation | Mar 28, 2023 | Instance Segmentation, Optical Flow Estimation | Code Available | 2 |
| F^2-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories | Mar 28, 2023 | NeRF, Novel View Synthesis | Code Available | 2 |
| One-Stage 3D Whole-Body Mesh Recovery with Component Aware Transformer | Mar 28, 2023 | 3D Human Pose Estimation, 3D Human Reconstruction | Code Available | 2 |
| Your Diffusion Model is Secretly a Zero-Shot Classifier | Mar 28, 2023 | Domain Generalization, Fine-Grained Image Classification | Code Available | 2 |
| Efficient Quality Diversity Optimization of 3D Buildings through 2D Pre-optimization | Mar 28, 2023 | Diversity | Code Available | 2 |
| Anti-DreamBooth: Protecting users from personalized text-to-image synthesis | Mar 27, 2023 | Image Generation | Code Available | 2 |
| Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective | Mar 27, 2023 | Image Quality Assessment, No-Reference Image Quality Assessment | Code Available | 2 |
| Label-Free Liver Tumor Segmentation | Mar 27, 2023 | Segmentation, Tumor Segmentation | Code Available | 2 |