| Medical SAM Adapter: Adapting Segment Anything Model for Medical Image Segmentation | Apr 25, 2023 | Image Segmentation, Medical Image Segmentation | Code Available | 3 |
| Bake off redux: a review and experimental evaluation of recent time series classification algorithms | Apr 25, 2023 | Dynamic Time Warping, Time Series | Code Available | 3 |
| Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model | Apr 24, 2023 | AudioCaps, Audio Generation | Code Available | 3 |
| Uni-QSAR: an Auto-ML Tool for Molecular Property Prediction | Apr 24, 2023 | Drug Discovery, Model Selection | Code Available | 3 |
| Segment Anything in 3D with Radiance Fields | Apr 24, 2023 | Inverse Rendering, Segmentation | Code Available | 3 |
| Safety Assessment of Chinese Large Language Models | Apr 20, 2023 | | Code Available | 3 |
| Anything-3D: Towards Single-view Anything Reconstruction in the Wild | Apr 19, 2023 | 3D Reconstruction, Diversity | Code Available | 3 |
| Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models | Apr 19, 2023 | Logical Reasoning | Code Available | 3 |
| SAM Fails to Segment Anything? -- SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More | Apr 18, 2023 | General Knowledge, Image Segmentation | Code Available | 3 |
| UniMax: Fairer and more Effective Language Sampling for Large-Scale Multilingual Pretraining | Apr 18, 2023 | | Code Available | 3 |
| Efficient Video Action Detection with Token Dropout and Context Refinement | Apr 17, 2023 | Action Detection, Decoder | Code Available | 3 |
| iDisc: Internal Discretization for Monocular Depth Estimation | Apr 13, 2023 | Autonomous Driving, Depth Estimation | Code Available | 3 |
| HybrIK-X: Hybrid Analytical-Neural Inverse Kinematics for Whole-body Mesh Recovery | Apr 12, 2023 | 3D Human Pose Estimation, 3D Human Reconstruction | Code Available | 3 |
| ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation | Apr 12, 2023 | Image Generation, Preference Mapping | Code Available | 3 |
| Geometric-aware Pretraining for Vision-centric 3D Object Detection | Apr 6, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 3 |
| AToMiC: An Image/Text Retrieval Test Collection to Support Multimedia Content Creation | Apr 4, 2023 | Cross-Modal Retrieval, Image-text Retrieval | Code Available | 3 |
| LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models | Apr 4, 2023 | Arithmetic Reasoning, Language Modelling | Code Available | 3 |
| Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos | Apr 3, 2023 | Image Generation, Text to Image Generation | Code Available | 3 |
| Dataset and Baseline System for Multi-lingual Extraction and Normalization of Temporal and Numerical Expressions | Mar 31, 2023 | Date Understanding, Information Retrieval | Code Available | 3 |
| Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks | Mar 30, 2023 | Human Parsing, Pedestrian Attribute Recognition | Code Available | 3 |
| Self-Refine: Iterative Refinement with Self-Feedback | Mar 30, 2023 | Mathematical Reasoning, Response Generation | Code Available | 3 |
| Sigmoid Loss for Language Image Pre-Training | Mar 27, 2023 | Contrastive Learning, Disentanglement | Code Available | 3 |
| A Survey on Causal Discovery Methods for I.I.D. and Time Series Data | Mar 27, 2023 | Causal Discovery, Time Series | Code Available | 3 |
| A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts | Mar 27, 2023 | Domain Adaptation, Source-Free Domain Adaptation | Code Available | 3 |
| Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior | Mar 24, 2023 | 3D geometry, Text to 3D | Code Available | 3 |
| BundleSDF: Neural 6-DoF Tracking and 3D Reconstruction of Unknown Objects | Mar 24, 2023 | 3D Object Detection, 3D Object Tracking | Code Available | 3 |
| FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization | Mar 24, 2023 | 3D Hand Pose Estimation, GPU | Code Available | 3 |
| PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360° | Mar 23, 2023 | Image Generation, Image Segmentation | Code Available | 3 |
| EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation | Mar 22, 2023 | 3D Object Detection, 6D Pose Estimation using RGB | Code Available | 3 |
| Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models | Mar 21, 2023 | 3D geometry, Text to 3D | Code Available | 3 |
| VAD: Vectorized Scene Representation for Efficient Autonomous Driving | Mar 21, 2023 | Autonomous Driving, Bench2Drive | Code Available | 3 |
| SemDeDup: Data-efficient learning at web-scale through semantic deduplication | Mar 16, 2023 | | Code Available | 3 |
| Cross-Modal Causal Intervention for Medical Report Generation | Mar 16, 2023 | Medical Report Generation, object-detection | Code Available | 3 |
| Highly Accurate Quantum Chemical Property Prediction with Uni-Mol+ | Mar 16, 2023 | Benchmarking, Graph Regression | Code Available | 3 |
| FateZero: Fusing Attentions for Zero-shot Text-based Video Editing | Mar 16, 2023 | Attribute, Text-to-Video Editing | Code Available | 3 |
| SurroundOcc: Multi-Camera 3D Occupancy Prediction for Autonomous Driving | Mar 16, 2023 | 3D Object Detection, Autonomous Driving | Code Available | 3 |
| A Simple Framework for Open-Vocabulary Segmentation and Detection | Mar 14, 2023 | Instance Segmentation, Panoptic Segmentation | Code Available | 3 |
| ViperGPT: Visual Inference via Python Execution for Reasoning | Mar 14, 2023 | Code Generation, Video Question Answering | Code Available | 3 |
| Relational Multi-Task Learning: Modeling Relations between Data and Tasks | Mar 14, 2023 | Multi-Task Learning, Transfer Learning | Code Available | 3 |
| One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale | Mar 12, 2023 | All, Image Generation | Code Available | 3 |
| Universal Instance Perception as Object Discovery and Retrieval | Mar 12, 2023 | Described Object Detection, Generalized Referring Expression Comprehension | Code Available | 3 |
| DINet: Deformation Inpainting Network for Realistic Face Visually Dubbing on High Resolution Video | Mar 7, 2023 | Decoder, Face Dubbing | Code Available | 3 |
| Learning Bipedal Walking for Humanoids with Current Feedback | Mar 7, 2023 | Deep Reinforcement Learning, Reinforcement Learning (RL) | Code Available | 3 |
| Deep symbolic regression for physics guided by units constraints: toward the automated discovery of physical laws | Mar 6, 2023 | Deep Reinforcement Learning, regression | Code Available | 3 |
| Unlimited-Size Diffusion Restoration | Mar 1, 2023 | Image Generation, Image Restoration | Code Available | 3 |
| EvoTorch: Scalable Evolutionary Computation in Python | Feb 24, 2023 | GPU, reinforcement-learning | Code Available | 3 |
| Deep OC-SORT: Multi-Pedestrian Tracking by Adaptive Re-Identification | Feb 23, 2023 | Multi-Object Tracking, Object | Code Available | 3 |
| Modeling Molecular Structures with Intrinsic Diffusion Models | Feb 23, 2023 | Computational chemistry, Molecular Docking | Code Available | 3 |
| VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion | Feb 23, 2023 | 3D geometry, 3D Semantic Scene Completion | Code Available | 3 |
| Vid2Avatar: 3D Avatar Reconstruction from Videos in the Wild via Self-supervised Scene Decomposition | Feb 22, 2023 | 3D Human Reconstruction, global-optimization | Code Available | 3 |