| Open-vocabulary object 6D pose estimation | Dec 1, 2023 | 6D Pose Estimation, Language Modelling | Unverified | 0 |
| SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers | Dec 1, 2023 | Decoder, Object | Code Available | 1 |
| Gaussian Grouping: Segment and Edit Anything in 3D Scenes | Dec 1, 2023 | Colorization, NeRF | Code Available | 2 |
| TrackDiffusion: Tracklet-Conditioned Video Generation via Diffusion Models | Dec 1, 2023 | Image Classification, Multi-Object Tracking | Code Available | 2 |
| Lasagna: Layered Score Distillation for Disentangled Object Relighting | Nov 30, 2023 | Colorization, Object | Code Available | 1 |
| LucidDreaming: Controllable Object-Centric 3D Generation | Nov 30, 2023 | 3D Generation, Benchmarking | Unverified | 0 |
| TIDE: Test Time Few Shot Object Detection | Nov 30, 2023 | Data Augmentation, Few-Shot Object Detection | Code Available | 0 |
| TrafficMOT: A Challenging Dataset for Multi-Object Tracking in Complex Traffic Scenarios | Nov 30, 2023 | Multi-Object Tracking, Object | Unverified | 0 |
| SimulFlow: Simultaneously Extracting Feature and Identifying Target for Unsupervised Video Object Segmentation | Nov 30, 2023 | Object, object-detection | Unverified | 0 |
| A Simple Video Segmenter by Tracking Objects Along Axial Trajectories | Nov 30, 2023 | GPU, Object | Code Available | 1 |
| Is Underwater Image Enhancement All Object Detectors Need? | Nov 30, 2023 | All, Image Enhancement | Code Available | 1 |
| HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video | Nov 30, 2023 | 3D Reconstruction, Object | Code Available | 2 |
| FoundPose: Unseen Object Pose Estimation with Foundation Features | Nov 30, 2023 | 6D Pose Estimation, Object | Unverified | 0 |
| Hy-Tracker: A Novel Framework for Enhancing Efficiency and Accuracy of Object Tracking in Hyperspectral Videos | Nov 30, 2023 | Object, object-detection | Unverified | 0 |
| Union-over-Intersections: Object Detection beyond Winner-Takes-All | Nov 30, 2023 | All, Instance Segmentation | Code Available | 0 |
| Informal Safety Guarantees for Simulated Optimizers Through Extrapolation from Partial Simulations | Nov 29, 2023 | Language Modeling, Language Modelling | Unverified | 0 |
| A Stochastic-Geometrical Framework for Object Pose Estimation based on Mixture Models Avoiding the Correspondence Problem | Nov 29, 2023 | Object, Pose Estimation | Unverified | 0 |
| Object-based (yet Class-agnostic) Video Domain Adaptation | Nov 29, 2023 | Action Recognition, Domain Adaptation | Unverified | 0 |
| HiDiffusion: Unlocking Higher-Resolution Creativity and Efficiency in Pretrained Diffusion Models | Nov 29, 2023 | Attribute, Image Generation | Unverified | 0 |
| Leveraging VLM-Based Pipelines to Annotate 3D Objects | Nov 29, 2023 | In-Context Learning, Language Modeling | Unverified | 0 |
| A Graph-Based Approach for Category-Agnostic Pose Estimation | Nov 29, 2023 | 2D Pose Estimation, Animal Pose Estimation | Code Available | 2 |
| The devil is in the fine-grained details: Evaluating open-vocabulary object detectors for fine-grained understanding | Nov 29, 2023 | Object, object-detection | Code Available | 1 |
| CG3D: Compositional Generation for Text-to-3D via Gaussian Splatting | Nov 29, 2023 | 3D Generation, Object | Unverified | 0 |
| Weakly-semi-supervised object detection in remotely sensed imagery | Nov 29, 2023 | Object, object-detection | Unverified | 0 |
| StructRe: Rewriting for Structured Shape Modeling | Nov 29, 2023 | Object | Unverified | 0 |
| RQFormer: Rotated Query Transformer for End-to-End Oriented Object Detection | Nov 29, 2023 | Decoder, Object | Code Available | 1 |
| Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation | Nov 29, 2023 | Clustering, Object | Code Available | 1 |
| Large Model Based Referring Camouflaged Object Detection | Nov 28, 2023 | model, Object | Unverified | 0 |
| CLiC: Concept Learning in Context | Nov 28, 2023 | Object | Unverified | 0 |
| DyRA: Portable Dynamic Resolution Adjustment Network for Existing Detectors | Nov 28, 2023 | Object, object-detection | Code Available | 0 |
| Feedback RoI Features Improve Aerial Object Detection | Nov 28, 2023 | feature selection, Object | Unverified | 0 |
| DepthSSC: Monocular 3D Semantic Scene Completion via Depth-Spatial Alignment and Voxel Adaptation | Nov 28, 2023 | 3D Semantic Scene Completion, Autonomous Driving | Unverified | 0 |
| UGG: Unified Generative Grasping | Nov 28, 2023 | Grasp Generation, Object | Code Available | 1 |
| Point'n Move: Interactive Scene Object Manipulation on Gaussian Splatting Radiance Fields | Nov 28, 2023 | Object | Unverified | 0 |
| Image segmentation with traveling waves in an exactly solvable recurrent neural network | Nov 28, 2023 | Image Segmentation, Object | Unverified | 0 |
| HandyPriors: Physically Consistent Perception of Hand-Object Interactions with Differentiable Priors | Nov 28, 2023 | Human-Object Interaction Detection, Object | Unverified | 0 |
| Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding | Nov 28, 2023 | Hallucination, Object | Code Available | 2 |
| Segment Every Out-of-Distribution Object | Nov 27, 2023 | Object, Segmentation | Code Available | 1 |
| SAM-6D: Segment Anything Model Meets Zero-Shot 6D Object Pose Estimation | Nov 27, 2023 | 6D Pose Estimation using RGB, Instance Segmentation | Code Available | 2 |
| EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension | Nov 27, 2023 | Image Captioning, Object | Unverified | 0 |
| Single-Model and Any-Modality for Video Object Tracking | Nov 27, 2023 | Object, Object Tracking | Code Available | 1 |
| CG-HOI: Contact-Guided 3D Human-Object Interaction Generation | Nov 27, 2023 | Human-Object Interaction Detection, Human-Object Interaction Generation | Unverified | 0 |
| Obj-NeRF: Extract Object NeRFs from Multi-view Images | Nov 26, 2023 | 3D geometry, 3D Reconstruction | Unverified | 0 |
| Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding | Nov 26, 2023 | 3D visual grounding, Object | Code Available | 1 |
| OpenNet: Incremental Learning for Autonomous Driving Object Detection with Balanced Loss | Nov 25, 2023 | Autonomous Driving, Incremental Learning | Unverified | 0 |
| Point2RBox: Combine Knowledge from Synthetic Visual Patterns for End-to-end Oriented Object Detection with Single Point Supervision | Nov 23, 2023 | Object, object-detection | Code Available | 1 |
| PointOBB: Learning Oriented Object Detection via Single Point Supervision | Nov 23, 2023 | Object, object-detection | Code Available | 1 |
| D-SCo: Dual-Stream Conditional Diffusion for Monocular Hand-Held Object Reconstruction | Nov 23, 2023 | Denoising, Object | Unverified | 0 |
| GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence | Nov 23, 2023 | Object, Pose Estimation | Unverified | 0 |
| Boosting3D: High-Fidelity Image-to-3D by Boosting 2D Diffusion Prior to 3D Prior with Progressive Learning | Nov 22, 2023 | 3D Generation, Image to 3D | Unverified | 0 |