| Emotionally Enhanced Talking Face Generation | Mar 21, 2023 | Face Generation, Talking Face Generation | Code Available | 2 |
| Large AI Models in Health Informatics: Applications, Challenges, and the Future | Mar 21, 2023 | Decision Making, Drug Discovery | Code Available | 2 |
| BigSmall: Efficient Multi-Task Learning for Disparate Spatial and Temporal Physiological Measurements | Mar 21, 2023 | Multi-Task Learning | Code Available | 2 |
| EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation | Mar 20, 2023 | 3D Face Animation, Decoder | Code Available | 2 |
| Leapfrog Diffusion Model for Stochastic Trajectory Prediction | Mar 20, 2023 | Denoising, model | Code Available | 2 |
| Explicit Visual Prompting for Low-Level Structure Segmentations | Mar 20, 2023 | Camouflaged Object Segmentation, Defocus Blur Detection | Code Available | 2 |
| M^2SNet: Multi-scale in Multi-scale Subtraction Network for Medical Image Segmentation | Mar 20, 2023 | Computed Tomography (CT), Decoder | Code Available | 2 |
| Generative Semantic Segmentation | Mar 20, 2023 | Segmentation, Semantic Segmentation | Code Available | 2 |
| MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action | Mar 20, 2023 | Multimodal Reasoning, Visual Question Answering | Code Available | 2 |
| VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking | Mar 20, 2023 | 3D Object Detection, Object | Code Available | 2 |
| SVDiff: Compact Parameter Space for Diffusion Fine-Tuning | Mar 20, 2023 | Data Augmentation, Diffusion Personalization | Code Available | 2 |
| CLIP goes 3D: Leveraging Prompt Tuning for Language Grounded 3D Recognition | Mar 20, 2023 | Retrieval, Scene Understanding | Code Available | 2 |
| Visual Prompt Multi-Modal Tracking | Mar 20, 2023 | Object Tracking, Prompt Learning | Code Available | 2 |
| Deep Learning for Camera Calibration and Beyond: A Survey | Mar 19, 2023 | Camera Calibration, Deep Learning | Code Available | 2 |
| NeRF-LOAM: Neural Implicit Representation for Large-Scale Incremental LiDAR Odometry and Mapping | Mar 19, 2023 | NeRF | Code Available | 2 |
| Large Language Model Instruction Following: A Survey of Progresses and Challenges | Mar 18, 2023 | Instruction Following, Language Modeling | Code Available | 2 |
| Towards Diverse Binary Segmentation via A Simple yet General Gated Network | Mar 18, 2023 | Decoder, Segmentation | Code Available | 2 |
| A Dynamic Multi-Scale Voxel Flow Network for Video Prediction | Mar 17, 2023 | Video Prediction | Code Available | 2 |
| FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model | Mar 17, 2023 | Face Detection, model | Code Available | 2 |
| A Simple Framework for 3D Occupancy Estimation in Autonomous Driving | Mar 17, 2023 | 3D Object Detection, 3D Reconstruction | Code Available | 2 |
| SRFormerV2: Taking a Closer Look at Permuted Self-Attention for Image Super-Resolution | Mar 17, 2023 | Image Super-Resolution, Super-Resolution | Code Available | 2 |
| MedNeXt: Transformer-driven Scaling of ConvNets for Medical Image Segmentation | Mar 17, 2023 | Decoder, Image Segmentation | Code Available | 2 |
| Taming Diffusion Models for Audio-Driven Co-Speech Gesture Generation | Mar 16, 2023 | Diversity, Gesture Generation | Code Available | 2 |
| A Short Survey of Viewing Large Language Models in Legal Aspect | Mar 16, 2023 | | Code Available | 2 |
| DiffIR: Efficient Diffusion Model for Image Restoration | Mar 16, 2023 | Denoising, Image Generation | Code Available | 2 |
| Large Selective Kernel Network for Remote Sensing Object Detection | Mar 16, 2023 | Object, object-detection | Code Available | 2 |
| DIRE for Diffusion-Generated Image Detection | Mar 16, 2023 | | Code Available | 2 |
| BEVHeight: A Robust Framework for Vision-based Roadside 3D Object Detection | Mar 15, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| BiFormer: Vision Transformer with Bi-Level Routing Attention | Mar 15, 2023 | Computational Efficiency, GPU | Code Available | 2 |
| VideoFlow: Exploiting Temporal Cues for Multi-frame Optical Flow Estimation | Mar 15, 2023 | Optical Flow Estimation, Triplet | Code Available | 2 |
| Skinned Motion Retargeting with Residual Perception of Motion Semantics & Geometry | Mar 15, 2023 | motion retargeting | Code Available | 2 |
| FastInst: A Simple Query-Based Model for Real-Time Instance Segmentation | Mar 15, 2023 | Decoder, Instance Segmentation | Code Available | 2 |
| Stochastic Interpolants: A Unifying Framework for Flows and Diffusions | Mar 15, 2023 | Denoising | Code Available | 2 |
| DiffusionAD: Norm-guided One-step Denoising Diffusion for Anomaly Detection | Mar 15, 2023 | Anomaly Detection, Denoising | Code Available | 2 |
| DiffBEV: Conditional Diffusion Model for Bird's Eye View Perception | Mar 15, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 2 |
| SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models | Mar 15, 2023 | Fact Checking, Hallucination | Code Available | 2 |
| V2V4Real: A Real-world Large-scale Dataset for Vehicle-to-Vehicle Cooperative Perception | Mar 14, 2023 | 3D Object Detection, 3D Object Tracking | Code Available | 2 |
| Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation | Mar 14, 2023 | 3D Generation, NeRF | Code Available | 2 |
| Automated Self-Supervised Learning for Recommendation | Mar 14, 2023 | Collaborative Filtering, Contrastive Learning | Code Available | 2 |
| InstMove: Instance Motion for Object-centric Video Segmentation | Mar 14, 2023 | Object, Optical Flow Estimation | Code Available | 2 |
| MeshDiffusion: Score-based Generative 3D Mesh Modeling | Mar 14, 2023 | Scene Generation | Code Available | 2 |
| Blind Video Deflickering by Neural Filtering with a Flawed Atlas | Mar 14, 2023 | Video Generation, Video Temporal Consistency | Code Available | 2 |
| LayoutDM: Discrete Diffusion Model for Controllable Layout Generation | Mar 14, 2023 | Layout Generation, model | Code Available | 2 |
| Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis | Mar 14, 2023 | 3D Point Cloud Classification, All | Code Available | 2 |
| PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents | Mar 13, 2023 | image-classification, Image Classification | Code Available | 2 |
| DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion | Mar 13, 2023 | Denoising | Code Available | 2 |
| TriDet: Temporal Action Detection with Relative Boundary Modeling | Mar 13, 2023 | Action Detection, Temporal Action Localization | Code Available | 2 |
| FreeNeRF: Improving Few-shot Neural Rendering with Free Frequency Regularization | Mar 13, 2023 | NeRF, Neural Rendering | Code Available | 2 |
| Model scale versus domain knowledge in statistical forecasting of chaotic systems | Mar 13, 2023 | Time Series, Time Series Analysis | Code Available | 2 |
| CrossFormer++: A Versatile Vision Transformer Hinging on Cross-scale Attention | Mar 13, 2023 | image-classification, Image Classification | Code Available | 2 |