SOTAVerified

Object

Replace the cat with a British Shorthair cat, a breed with bulging yellow eyes

Papers

Showing 401–425 of 10696 papers

| Title | Status | Hype |
| --- | --- | --- |
| Visual-RFT: Visual Reinforcement Fine-Tuning | Code | 7 |
| Convex Hull-based Algebraic Constraint for Visual Quadric SLAM | Code | 1 |
| Language-Guided Object Search in Agricultural Environments | | 0 |
| AI-Driven Relocation Tracking in Dynamic Kitchen Environments | Code | 0 |
| Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning | Code | 1 |
| Taming Large Multimodal Agents for Ultra-low Bitrate Semantically Disentangled Image Compression | Code | 0 |
| EigenActor: Variant Body-Object Interaction Generation Evolved from Invariant Action Basis Reasoning | | 0 |
| Dynamic Markov Blanket Detection for Macroscopic Physics Discovery | Code | 1 |
| Towards Semantic 3D Hand-Object Interaction Generation via Functional Text Guidance | | 0 |
| Mitigating Hallucinations in Large Vision-Language Models by Adaptively Constraining Information Flow | Code | 1 |
| Enhancing deep neural networks through complex-valued representations and Kuramoto synchronization dynamics | | 0 |
| Vector-Quantized Vision Foundation Models for Object-Centric Learning | Code | 1 |
| Vision-Encoders (Already) Know What They See: Mitigating Object Hallucination via Simple Fine-Grained CLIPScore | Code | 0 |
| BEVDiffuser: Plug-and-Play Diffusion Model for BEV Denoising with Ground-Truth Guidance | | 0 |
| QORT-Former: Query-optimized Real-time Transformer for Understanding Two Hands Manipulating Objects | | 0 |
| CLIP Under the Microscope: A Fine-Grained Analysis of Multi-Object Representation | Code | 1 |
| Analyzing CLIP's Performance Limitations in Multi-Object Scenarios: A Controlled High-Resolution Study | | 0 |
| MITracker: Multi-View Integration for Visual Object Tracking | | 0 |
| C-Drag: Chain-of-Thought Driven Motion Controller for Video Generation | Code | 1 |
| InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions | Code | 3 |
| CoopDETR: A Unified Cooperative Perception Framework for 3D Detection via Object Query | | 0 |
| ObjectVLA: End-to-End Open-World Object Manipulation Without Demonstration | | 0 |
| Spectral-Enhanced Transformers: Leveraging Large-Scale Pretrained Models for Hyperspectral Object Tracking | | 0 |
| Ev-3DOD: Pushing the Temporal Boundaries of 3D Object Detection with Event Cameras | Code | 1 |
| Dictionary-based Framework for Interpretable and Consistent Object Parsing | | 0 |

No leaderboard results yet.