SOTAVerified

Zero-shot Generalization

Papers

Showing 351–400 of 572 papers

| Title | Status | Hype |
| --- | --- | --- |
| Learning to navigate by distilling visual information and natural language instructions | | 0 |
| Learning to Represent State with Perceptual Schemata | | 0 |
| Leveraging Jumpy Models for Planning and Fast Learning in Robotic Domains | | 0 |
| LeVERB: Humanoid Whole-Body Control with Latent Vision-Language Instruction | | 0 |
| Light Field Diffusion for Single-View Novel View Synthesis | | 0 |
| LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias | | 0 |
| Marigold-DC: Zero-Shot Monocular Depth Completion with Guided Diffusion | | 0 |
| MASP: Scalable GNN-based Planning for Multi-Agent Navigation | | 0 |
| Matching options to tasks using Option-Indexed Hierarchical Reinforcement Learning | | 0 |
| F^2Depth: Self-supervised Indoor Monocular Depth Estimation via Optical Flow Consistency and Feature Map Synthesis | | 0 |
| SAM^Med: A medical image annotation framework based on large vision model | | 0 |
| Mechanistic Understandings of Representation Vulnerabilities and Engineering Robust Vision Transformers | | 0 |
| MIFNet: Learning Modality-Invariant Features for Generalizable Multimodal Image Matching | | 0 |
| Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning | | 0 |
| MmAP : Multi-modal Alignment Prompt for Cross-domain Multi-task Learning | | 0 |
| Mono2Stereo: Monocular Knowledge Transfer for Enhanced Stereo Matching | | 0 |
| Multiple Consistency-guided Test-Time Adaptation for Contrastive Audio-Language Models with Unlabeled Audio | | 0 |
| Multi-View Unsupervised Image Generation with Cross Attention Guidance | | 0 |
| Neural Attention Memory | | 0 |
| Neural Field Dynamics Model for Granular Object Piles Manipulation | | 0 |
| NeuralSCF: Neural network self-consistent fields for density functional theory | | 0 |
| NVSPolicy: Adaptive Novel-View Synthesis for Generalizable Language-Conditioned Policy Learning | | 0 |
| On the Evaluation of Generative Robotic Simulations | | 0 |
| On the Out-Of-Distribution Generalization of Multimodal Large Language Models | | 0 |
| On the Out-Of-Distribution Generalization of Large Multimodal Models | | 0 |
| On the Performance of Multimodal Language Models | | 0 |
| On the Use of Linguistic Features for the Evaluation of Generative Dialogue Systems | | 0 |
| On the Zero-shot Adversarial Robustness of Vision-Language Models: A Truly Zero-shot and Training-free Approach | | 0 |
| On the Zero-Shot Generalization of Machine-Generated Text Detectors | | 0 |
| OpenSU3D: Open World 3D Scene Understanding using Foundation Models | | 0 |
| ORQA: A Benchmark and Foundation Model for Holistic Operating Room Modeling | | 0 |
| OTTER: A Vision-Language-Action Model with Text-Aware Visual Feature Extraction | | 0 |
| Performance and Non-adversarial Robustness of the Segment Anything Model 2 in Surgical Video Segmentation | | 0 |
| PhD Thesis: Exploring the role of (self-)attention in cognitive and computer vision architecture | | 0 |
| PoE: a Panel of Experts for Generalized Automatic Dialogue Assessment | | 0 |
| PoseLess: Depth-Free Vision-to-Joint Control via Direct Image Mapping with VLM | | 0 |
| From Pixels to Predicates: Learning Symbolic World Models via Pretrained Vision-Language Models | | 0 |
| Pro2SAM: Mask Prompt to SAM with Grid Points for Weakly Supervised Object Localization | | 0 |
| Program Guided Agent | | 0 |
| Prompt-based Visual Alignment for Zero-shot Policy Transfer | | 0 |
| PromptSync: Bridging Domain Gaps in Vision-Language Models through Class-Aware Prototype Alignment and Discrimination | | 0 |
| RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in Multi-Agent Deep Reinforcement Learning | | 0 |
| RAILGUN: A Unified Convolutional Policy for Multi-Agent Path Finding Across Different Environments and Tasks | | 0 |
| RD-GAN: Few/Zero-Shot Chinese Character Style Transfer via Radical Decomposition and Rendering | | 0 |
| Real-Time Anomaly Detection and Reactive Planning with Large Language Models | | 0 |
| Reinforcement Learning of Implicit and Explicit Control Flow in Instructions | | 0 |
| Revisiting the Robust Generalization of Adversarial Prompt Tuning | | 0 |
| RNG-KBQA: Generation Augmented Iterative Ranking for Knowledge Base Question Answering | | 0 |
| Robotic Programmer: Video Instructed Policy Code Generation for Robotic Manipulation | | 0 |
| Robot Skill Generalization via Keypoint Integrated Soft Actor-Critic Gaussian Mixture Models | | 0 |
Page 8 of 12

Benchmark Results

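| # | Model | Metric | Claimed | Verified | Status |
| --- | --- | --- | --- | --- | --- |
| 1 | GR-MG | Avg. sequence length | 4.04 | | Unverified |
| 2 | MoDE | Avg. sequence length | 4.01 | | Unverified |
| 3 | RoboUniView | Avg. sequence length | 3.65 | | Unverified |
| 4 | 3D Diffuser Actor | Avg. sequence length | 3.27 | | Unverified |
| 5 | GR-1 | Avg. sequence length | 3.06 | | Unverified |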