SOTAVerified

Object

Replace the cat with a British Shorthair cat of the breed with bulging yellow eyes

Papers

Showing 2701-2750 of 10696 papers

| Title | Status | Hype |
| --- | --- | --- |
| Marmot: Multi-Agent Reasoning for Multi-Object Self-Correcting in Improving Image-Text Alignment | | 0 |
| BoxDreamer: Dreaming Box Corners for Generalizable Object Pose Estimation | | 0 |
| POEM: Precise Object-level Editing via MLLM control | | 0 |
| Glossy Object Reconstruction with Cost-effective Polarized Acquisition | | 0 |
| Compass Control: Multi Object Orientation Control for Text-to-Image Generation | | 0 |
| MovSAM: A Single-image Moving Object Segmentation Framework Based on Deep Thinking | Code | 0 |
| DLTPose: 6DoF Pose Estimation From Accurate Dense Surface Point Estimates | | 0 |
| Better Decisions through the Right Causal World Model | | 0 |
| A Self-Supervised Framework for Space Object Behaviour Characterisation | | 0 |
| PRIMEDrive-CoT: A Precognitive Chain-of-Thought Framework for Uncertainty-Aware Object Interaction in Driving Scene Scenario | | 0 |
| D-Feat Occlusions: Diffusion Features for Robustness to Partial Visual Occlusions in Object Recognition | | 0 |
| Grounding 3D Object Affordance with Language Instructions, Visual Observations and Interactions | | 0 |
| EMF: Event Meta Formers for Event-based Real-time Traffic Object Detection | | 0 |
| CornerPoint3D: Look at the Nearest Corner Instead of the Center | | 0 |
| RASP: Revisiting 3D Anamorphic Art for Shadow-Guided Packing of Irregular Objects | | 0 |
| Deep Reinforcement Learning via Object-Centric Attention | Code | 0 |
| COST: Contrastive One-Stage Transformer for Vision-Language Small Object Tracking | | 0 |
| TransforMerger: Transformer-based Voice-Gesture Fusion for Robust Human-Robot Communication | | 0 |
| A Diffusion-Based Framework for Occluded Object Movement | | 0 |
| Slot-Level Robotic Placement via Visual Imitation from Single Human Video | | 0 |
| Towards Unified Referring Expression Segmentation Across Omni-Level Visual Target Granularities | Code | 0 |
| Deep LG-Track: An Enhanced Localization-Confidence-Guided Multi-Object Tracker | | 0 |
| MB-ORES: A Multi-Branch Object Reasoner for Visual Grounding in Remote Sensing | Code | 0 |
| Detail-aware multi-view stereo network for depth estimation | Code | 0 |
| Physically Ground Commonsense Knowledge for Articulated Object Manipulation with Analytic Concepts | | 0 |
| Object Isolated Attention for Consistent Story Visualization | | 0 |
| ReferDINO-Plus: 2nd Solution for 4th PVUW MeViS Challenge at CVPR 2025 | Code | 0 |
| Context in object detection: a systematic literature review | | 0 |
| Efficient Explicit Joint-level Interaction Modeling with Mamba for Text-guided HOI Generation | Code | 0 |
| Segment then Splat: A Unified Approach for 3D Open-Vocabulary Segmentation based on Gaussian Splatting | | 0 |
| RUNA: Object-level Out-of-Distribution Detection via Regional Uncertainty Alignment of Multimodal Representations | | 0 |
| SIGHT: Single-Image Conditioned Generation of Hand Trajectories for Hand-Object Interaction | | 0 |
| VisTa: Visual-contextual and Text-augmented Zero-shot Object-level OOD Detection | | 0 |
| ForcePose: A Deep Learning Approach for Force Calculation Based on Action Recognition Using MediaPipe Pose Estimation Combined with Object Detection | | 0 |
| SemAlign3D: Semantic Correspondence between RGB-Images through Aligning 3D Object-Class Representations | | 0 |
| TranSplat: Lighting-Consistent Cross-Scene Object Transfer with 3D Gaussian Splatting | | 0 |
| Hyperspectral Adapter for Object Tracking based on Hyperspectral Video | | 0 |
| The Marine Debris Forward-Looking Sonar Datasets | | 0 |
| CTRL-O: Language-Controllable Object-Centric Visual Representation Learning | | 0 |
| AGILE: A Diffusion-Based Attention-Guided Image and Label Translation for Efficient Cross-Domain Plant Trait Identification | Code | 0 |
| Visual Jenga: Discovering Object Dependencies via Counterfactual Inpainting | | 0 |
| RelTriple: Learning Plausible Indoor Layouts by Integrating Relationship Triples into the Diffusion Process | | 0 |
| Guiding Human-Object Interactions with Rich Geometry and Relations | | 0 |
| Incremental Object Keypoint Learning | | 0 |
| GLRD: Global-Local Collaborative Reason and Debate with PSL for 3D Open-Vocabulary Detection | | 0 |
| Zero-Shot Human-Object Interaction Synthesis with Multimodal Priors | | 0 |
| Learning 3D Object Spatial Relationships from Pre-trained 2D Diffusion Models | | 0 |
| LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration | Code | 0 |
| Visuo-Tactile Object Pose Estimation for a Multi-Finger Robot Hand with Low-Resolution In-Hand Tactile Sensing | | 0 |
| Beyond Object Categories: Multi-Attribute Reference Understanding for Visual Grounding | | 0 |

No leaderboard results yet.