Zero-shot Generalization

Papers

Showing 51–100 of 572 papers

Title	Date	Tasks	Status	Hype	Score
Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning	Apr 4, 2024	3D Scene Reconstruction, Depth Estimation	Code Available	2	5
Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models	Sep 15, 2022	image-classification, Image Classification	Code Available	2	5
Kick Back & Relax++: Scaling Beyond Ground-Truth Depth with SlowTV & CribsTV	Mar 3, 2024	Depth Estimation, Monocular Depth Estimation	Code Available	2	5
Selective Hourglass Mapping for Universal Image Restoration Based on Diffusion Model	Mar 17, 2024	Image Restoration, Zero-shot Generalization	Code Available	2	5
Semantic Guidance Tuning for Text-To-Image Diffusion Models	Dec 26, 2023	Zero-shot Generalization	Code Available	2	5
Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation	Dec 20, 2023	Robot Manipulation, Zero-shot Generalization	Code Available	2	5
Segment Any Anomaly without Training via Hybrid Prompt Regularization	May 18, 2023	Anomaly Detection, Anomaly Localization	Code Available	2	5
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning	Dec 17, 2024	Denoising	Code Available	2	5
SAM2MOT: A Novel Paradigm of Multi-Object Tracking by Segmentation	Apr 6, 2025	Multi-Object Tracking, Object	Code Available	2	5
Autoregressive Image Generation with Randomized Parallel Decoding	Mar 13, 2025	Conditional Image Generation, Image Generation	Code Available	2	5
Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents	Apr 19, 2023	Information Retrieval, Passage Ranking	Code Available	2	5
RSBuilding: Towards General Remote Sensing Image Building Extraction and Change Detection with Foundation Model	Mar 12, 2024	Change Detection, Zero-shot Generalization	Code Available	2	5
vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation	Nov 26, 2024	Image Segmentation, Medical Image Analysis	Code Available	2	5
PrimeDepth: Efficient Monocular Depth Estimation with a Stable Diffusion Preimage	Sep 13, 2024	Depth Estimation, Monocular Depth Estimation	Code Available	2	5
IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS	Sep 9, 2024	Denoising, Speech Enhancement	Code Available	2	5
Q-Insight: Understanding Image Quality via Visual Reinforcement Learning	Mar 28, 2025	Descriptive, Image Quality Assessment	Code Available	2	5
On the test-time zero-shot generalization of vision-language models: Do we really need prompt learning?	May 3, 2024	Computational Efficiency, Prompt Learning	Code Available	2	5
InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions	Jan 24, 2024	document understanding, Question Answering	Code Available	2	5
OpenCity: Open Spatio-Temporal Foundation Models for Traffic Prediction	Aug 16, 2024	Prediction, Traffic Prediction	Code Available	2	5
Delineate Anything: Resolution-Agnostic Field Boundary Delineation on Satellite Imagery	Apr 3, 2025	Field Boundary Delineation, Instance Segmentation	Code Available	2	5
NeRF-Supervised Deep Stereo	Mar 30, 2023	NeRF, Neural Rendering	Code Available	2	5
Multitask Prompted Training Enables Zero-Shot Task Generalization	Oct 15, 2021	Benchmarking, Decoder	Code Available	2	5
Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model	Mar 8, 2025	Image Quality Assessment, Language Modeling	Code Available	2	5
DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment	Jul 3, 2025	cross-modal alignment, Instruction Following	Code Available	2	5
Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement	Oct 15, 2024	Disentanglement, Inductive Bias	Code Available	2	5
Detecting Everything in the Open World: Towards Universal Object Detection	Mar 21, 2023	object-detection, Object Detection	Code Available	2	5
Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient	Nov 26, 2024	GPU, Image Generation	Code Available	2	5
Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning	Feb 4, 2024	Contact-rich Manipulation, Zero-shot Generalization	Code Available	2	5
Matryoshka Diffusion Models	Oct 23, 2023	Image Generation, Zero-shot Generalization	Code Available	2	5
Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation	Jul 3, 2024	Domain Generalization, Knowledge Distillation	Code Available	2	5
Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression	May 26, 2025	Zero-shot Generalization	Code Available	2	5
RoboUniView: Visual-Language Model with Unified View Representation for Robotic Manipulation	Jun 27, 2024	Language Modeling, Language Modelling	Code Available	2	5
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance	Apr 4, 2024	Benchmarking, Image Generation	Code Available	2	5
Crosslingual Generalization through Multitask Finetuning	Nov 3, 2022	Coreference Resolution, Cross-Lingual Transfer	Code Available	2	5
Efficient Alignment of Unconditioned Action Prior for Language-conditioned Pick and Place in Clutter	Mar 12, 2025	Zero-shot Generalization	Code Available	2	5
EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task Tasks for E-commerce	Aug 14, 2023	Diversity, Instruction Following	Code Available	2	5
RecGPT: A Foundation Model for Sequential Recommendation	Jun 6, 2025	Decoder, model	Code Available	2	5
VIMA: General Robot Manipulation with Multimodal Prompts	Oct 6, 2022	Imitation Learning, Language Modelling	Code Available	2	5
How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation	Dec 12, 2023	Anomaly Detection, Autonomous Driving	Code Available	1	5
CLIP-Forge: Towards Zero-Shot Text-to-Shape Generation	Oct 6, 2021	Image Generation, Text to Image Generation	Code Available	1	5
LR0.FM: Low-Res Benchmark and Improving Robustness for Zero-Shot Classification in Foundation Models	Feb 6, 2025	zero-shot-classification, Zero-shot Generalization	Code Available	1	5
CLIP-Embed-KD: Computationally Efficient Knowledge Distillation Using Embeddings as Teachers	Apr 9, 2024	Knowledge Distillation, Zero-shot Generalization	Code Available	1	5
M^3GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation	May 25, 2024	Language Modeling, Language Modelling	Code Available	1	5
Gradient Ascent Post-training Enhances Language Model Generalization	Jun 12, 2023	Language Modeling, Language Modelling	Code Available	1	5
GOMAA-Geo: GOal Modality Agnostic Active Geo-localization	Jun 4, 2024	Contrastive Learning, geo-localization	Code Available	1	5
Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning	Jan 19, 2021	reinforcement-learning, Reinforcement Learning (RL)	Code Available	1	5
MAgNet: Mesh Agnostic Neural PDE Solver	Oct 11, 2022	Zero-shot Generalization	Code Available	1	5
Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks	Oct 31, 2017	Machine Translation, Translation	Code Available	1	5
Generalization to New Actions in Reinforcement Learning	Nov 3, 2020	reinforcement-learning, Reinforcement Learning	Code Available	1	5
Boosting Segment Anything Model Towards Open-Vocabulary Learning	Dec 6, 2023	model, Object	Code Available	1	5

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GR-MG	Avg. sequence length	4.04	—	Unverified
2	MoDE	Avg. sequence length	4.01	—	Unverified
3	RoboUniView	Avg. sequence length	3.65	—	Unverified
4	3D Diffuser Actor	Avg. sequence length	3.27	—	Unverified
5	GR-1	Avg. sequence length	3.06	—	Unverified