SOTAVerified

Zero-shot Generalization

Papers

Showing 76-100 of 572 papers

Title | Status | Hype
Jasmine: Harnessing Diffusion Prior for Self-supervised Depth Estimation |  | 0
Enhancing Zero-Shot Image Recognition in Vision-Language Models through Human-like Concept Guidance |  | 0
STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding | Code | 1
GenM^3: Generative Pretrained Multi-path Motion Model for Text Conditional Human Motion Generation |  | 0
Learning with Expert Abstractions for Efficient Multi-Task Continuous Control | Code | 0
Good Actions Succeed, Bad Actions Generalize: A Case Study on Why RL Generalizes Better |  | 0
Foundation Feature-Driven Online End-Effector Pose Estimation: A Marker-Free and Learning-Free Approach |  | 0
Compound Expression Recognition via Large Vision-Language Models |  | 0
Autoregressive Image Generation with Randomized Parallel Decoding | Code | 2
Efficient Alignment of Unconditioned Action Prior for Language-conditioned Pick and Place in Clutter | Code | 2
Prompt-OT: An Optimal Transport Regularization Paradigm for Knowledge Preservation in Vision-Language Model Adaptation | Code | 0
A Recipe for Improving Remote Sensing VLM Zero Shot Generalization |  | 0
PE3R: Perception-Efficient 3D Reconstruction | Code | 3
PoseLess: Depth-Free Vision-to-Joint Control via Direct Image Mapping with VLM |  | 0
Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement | Code | 4
Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model | Code | 2
OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction |  | 0
RAILGUN: A Unified Convolutional Policy for Multi-Agent Path Finding Across Different Environments and Tasks |  | 0
Nature-Inspired Population-Based Evolution of Large Language Models | Code | 1
Re-Imagining Multimodal Instruction Tuning: A Representation View | Code | 0
Delving into Out-of-Distribution Detection with Medical Vision-Language Models | Code | 1
Contrastive Learning of English Language and Crystal Graphs for Multimodal Representation of Materials Knowledge |  | 0
Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models |  | 0
GeLLMO: Generalizing Large Language Models for Multi-property Molecule Optimization | Code | 0
WRT-SAM: Foundation Model-Driven Segmentation for Generalized Weld Radiographic Testing |  | 0
Page 4 of 23

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GR-MG | Avg. sequence length | 4.04 |  | Unverified
2 | MoDE | Avg. sequence length | 4.01 |  | Unverified
3 | RoboUniView | Avg. sequence length | 3.65 |  | Unverified
4 | 3D Diffuser Actor | Avg. sequence length | 3.27 |  | Unverified
5 | GR-1 | Avg. sequence length | 3.06 |  | Unverified