SOTAVerified

Zero-shot Generalization

Papers

Showing 201–225 of 572 papers

| Title | Status | Hype |
| --- | --- | --- |
| IMRL: Integrating Visual, Physical, Temporal, and Geometric Representations for Enhanced Food Acquisition | | 0 |
| ScaleFlow++: Robust and Accurate Estimation of 3D Motion from Video | Code | 1 |
| Benchmarking VLMs' Reasoning About Persuasive Atypical Images | | 0 |
| PrimeDepth: Efficient Monocular Depth Estimation with a Stable Diffusion Preimage | Code | 2 |
| AnySkin: Plug-and-play Skin Sensing for Robotic Touch | | 0 |
| IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS | Code | 2 |
| TanDepth: Leveraging Global DEMs for Metric Monocular Depth Estimation in UAVs | | 0 |
| Adapting Segment Anything Model to Multi-modal Salient Object Detection with Semantic Feature Fusion Guidance | Code | 1 |
| GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal-Conditioned Policy | Code | 2 |
| Segment Anything Model for Grain Characterization in Hard Drive Design | | 0 |
| Large Language Models as Foundations for Next-Gen Dense Retrieval: A Comprehensive Empirical Assessment | | 0 |
| Generalizable Facial Expression Recognition | Code | 1 |
| Zero-Shot Object-Centric Representation Learning | | 0 |
| OpenCity: Open Spatio-Temporal Foundation Models for Traffic Prediction | Code | 2 |
| One Shot is Enough for Sequential Infrared Small Target Segmentation | Code | 0 |
| Performance and Non-adversarial Robustness of the Segment Anything Model 2 in Surgical Video Segmentation | | 0 |
| Visual Grounding for Object-Level Generalization in Reinforcement Learning | Code | 1 |
| HeteroMorpheus: Universal Control Based on Morphological Heterogeneity Modeling | Code | 0 |
| Segment Anything for Videos: A Systematic Survey | Code | 5 |
| HybridDepth: Robust Metric Depth Fusion by Leveraging Depth from Focus and Single-Image Priors | Code | 2 |
| HDL-GPT: High-Quality HDL is All You Need | | 0 |
| SSTD: Stripe-Like Space Target Detection Using Single-Point Weak Supervision | | 0 |
| Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Vision-Language Models | Code | 1 |
| OpenSU3D: Open World 3D Scene Understanding using Foundation Models | | 0 |
| BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models | | 0 |

Benchmark Results

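| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GR-MG | Avg. sequence length | 4.04 | | Unverified |
| 2 | MoDE | Avg. sequence length | 4.01 | | Unverified |
| 3 | RoboUniView | Avg. sequence length | 3.65 | | Unverified |
| 4 | 3D Diffuser Actor | Avg. sequence length | 3.27 | | Unverified |
| 5 | GR-1 | Avg. sequence length | 3.06 | | Unverified |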