SOTAVerified

Object

Replace the cat with a British Shorthair cat with bulging yellow eyes

Papers

Showing 251–300 of 10696 papers

| Title | Status | Hype |
| --- | --- | --- |
| TransforMerger: Transformer-based Voice-Gesture Fusion for Robust Human-Robot Communication | | 0 |
| A Diffusion-Based Framework for Occluded Object Movement | | 0 |
| v-CLR: View-Consistent Learning for Open-World Instance Segmentation | Code | 1 |
| Deep LG-Track: An Enhanced Localization-Confidence-Guided Multi-Object Tracker | | 0 |
| Towards Unified Referring Expression Segmentation Across Omni-Level Visual Target Granularities | Code | 0 |
| Detail-aware multi-view stereo network for depth estimation | Code | 0 |
| MB-ORES: A Multi-Branch Object Reasoner for Visual Grounding in Remote Sensing | Code | 0 |
| Physically Ground Commonsense Knowledge for Articulated Object Manipulation with Analytic Concepts | | 0 |
| EagleVision: Object-level Attribute Multimodal LLM for Remote Sensing | Code | 1 |
| Object Isolated Attention for Consistent Story Visualization | | 0 |
| ReferDINO-Plus: 2nd Solution for 4th PVUW MeViS Challenge at CVPR 2025 | Code | 0 |
| DASH: Detection and Assessment of Systematic Hallucinations of VLMs | Code | 1 |
| Context in object detection: a systematic literature review | | 0 |
| Efficient Explicit Joint-level Interaction Modeling with Mamba for Text-guided HOI Generation | Code | 0 |
| Hyperspectral Adapter for Object Tracking based on Hyperspectral Video | | 0 |
| The Marine Debris Forward-Looking Sonar Datasets | | 0 |
| SIGHT: Single-Image Conditioned Generation of Hand Trajectories for Hand-Object Interaction | | 0 |
| ForcePose: A Deep Learning Approach for Force Calculation Based on Action Recognition Using MediaPipe Pose Estimation Combined with Object Detection | | 0 |
| SemAlign3D: Semantic Correspondence between RGB-Images through Aligning 3D Object-Class Representations | | 0 |
| Segment then Splat: A Unified Approach for 3D Open-Vocabulary Segmentation based on Gaussian Splatting | | 0 |
| TranSplat: Lighting-Consistent Cross-Scene Object Transfer with 3D Gaussian Splatting | | 0 |
| VisTa: Visual-contextual and Text-augmented Zero-shot Object-level OOD Detection | | 0 |
| RUNA: Object-level Out-of-Distribution Detection via Regional Uncertainty Alignment of Multimodal Representations | | 0 |
| AGILE: A Diffusion-Based Attention-Guided Image and Label Translation for Efficient Cross-Domain Plant Trait Identification | Code | 0 |
| BOOTPLACE: Bootstrapped Object Placement with Detection Transformers | Code | 1 |
| Visual Jenga: Discovering Object Dependencies via Counterfactual Inpainting | | 0 |
| Learning Class Prototypes for Unified Sparse Supervised 3D Object Detection | Code | 1 |
| CTRL-O: Language-Controllable Object-Centric Visual Representation Learning | | 0 |
| RelTriple: Learning Plausible Indoor Layouts by Integrating Relationship Triples into the Diffusion Process | | 0 |
| GLRD: Global-Local Collaborative Reason and Debate with PSL for 3D Open-Vocabulary Detection | | 0 |
| Incremental Object Keypoint Learning | | 0 |
| Guiding Human-Object Interactions with Rich Geometry and Relations | | 0 |
| LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration | Code | 0 |
| DynOPETs: A Versatile Benchmark for Dynamic Object Pose Estimation and Tracking in Moving Camera Scenarios | Code | 1 |
| Learning 3D Object Spatial Relationships from Pre-trained 2D Diffusion Models | | 0 |
| COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting | Code | 2 |
| CamSAM2: Segment Anything Accurately in Camouflaged Videos | Code | 1 |
| Visuo-Tactile Object Pose Estimation for a Multi-Finger Robot Hand with Low-Resolution In-Hand Tactile Sensing | | 0 |
| Beyond Object Categories: Multi-Attribute Reference Understanding for Visual Grounding | | 0 |
| Zero-Shot Human-Object Interaction Synthesis with Multimodal Priors | | 0 |
| Benchmarking Object Detectors under Real-World Distribution Shifts in Satellite Imagery | Code | 1 |
| Human-Object Interaction with Vision-Language Model Guided Relative Movement Dynamics | | 0 |
| Global-Local Tree Search in VLMs for 3D Indoor Scene Generation | Code | 1 |
| CQ-DINO: Mitigating Gradient Dilution via Category Queries for Vast Vocabulary Object Detection | Code | 0 |
| Online 3D Scene Reconstruction Using Neural Object Priors | | 0 |
| Any6D: Model-free 6D Pose Estimation of Novel Objects | | 0 |
| An Image-like Diffusion Method for Human-Object Interaction Detection | | 0 |
| Decorum: A Language-Based Approach For Style-Conditioned Synthesis of Indoor 3D Scenes | | 0 |
| OmnimatteZero: Training-free Real-time Omnimatte with Pre-trained Video Diffusion Models | | 0 |
| Shapley-Scarf Markets with Objective Indifferences | | 0 |

No leaderboard results yet.