SOTAVerified

Object

Replace the cat with a British Shorthair, a breed with bulging yellow eyes

Papers

Showing 1951–1975 of 10696 papers

| Title | Status | Hype |
| --- | --- | --- |
| MuLan: Multimodal-LLM Agent for Progressive and Interactive Multi-Object Diffusion | Code | 1 |
| DINOBot: Robot Manipulation via Retrieval and Alignment with Vision Foundation Models | | 0 |
| Slot-VLM: SlowFast Slots for Video-Language Modeling | | 0 |
| Efficient Parameter Mining and Freezing for Continual Object Detection | | 0 |
| TEXT2AFFORD: Probing Object Affordance Prediction abilities of Language Models solely from Text | Code | 0 |
| Object-level Geometric Structure Preserving for Natural Image Stitching | Code | 1 |
| OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog | | 0 |
| GOOD: Towards Domain Generalized Orientated Object Detection | | 0 |
| Visual Reasoning in Object-Centric Deep Neural Networks: A Comparative Cognition Approach | Code | 0 |
| Open3DSG: Open-Vocabulary 3D Scene Graphs from Point Clouds with Queryable Objects and Open-Set Relationships | Code | 2 |
| UncertaintyTrack: Exploiting Detection and Localization Uncertainty in Multi-Object Tracking | Code | 1 |
| Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models | Code | 1 |
| CoLLaVO: Crayon Large Language and Vision mOdel | Code | 2 |
| GaussianObject: High-Quality 3D Object Reconstruction from Four Views with Gaussian Splatting | Code | 5 |
| Lester: rotoscope animation through video object segmentation and tracking | Code | 1 |
| Detecting Anomalous Events in Object-centric Business Processes via Graph Neural Networks | Code | 0 |
| Moving Object Proposals with Deep Learned Optical Flow for Video Object Segmentation | | 0 |
| Few-Shot Object Detection with Sparse Context Transformers | | 0 |
| Leveraging Self-Supervised Instance Contrastive Learning for Radar Object Detection | | 0 |
| H2O-SDF: Two-phase Learning for 3D Indoor Reconstruction using Object Surface Fields | Code | 0 |
| Unsupervised Discovery of Object-Centric Neural Fields | | 0 |
| Exploring Perceptual Limitation of Multimodal Large Language Models | Code | 1 |
| GBOT: Graph-Based 3D Object Tracking for Augmented Reality-Assisted Assembly Guidance | Code | 1 |
| Semantic Object-level Modeling for Robust Visual Camera Relocalization | | 0 |
| Event-to-Video Conversion for Overhead Object Detection | | 0 |

No leaderboard results yet.