Zero-shot Generalization

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1–50 of 572 papers

Title	Date	Tasks	Status	Hype	Score
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction	Apr 3, 2024	Image GenerationImage Reconstruction	CodeCode Available	9	5
Visually Descriptive Language Model for Vector Graphics Reasoning	Apr 9, 2024	DescriptiveLanguage Modeling	CodeCode Available	9	5
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation	Mar 22, 2024	Depth EstimationSurface Normal Estimation	CodeCode Available	7	5
Large Concept Models: Language Modeling in a Sentence Representation Space	Dec 11, 2024	Language ModelingLanguage Modelling	CodeCode Available	7	5
FoundationStereo: Zero-Shot Stereo Matching	Jan 17, 2025	Depth EstimationDiversity	CodeCode Available	7	5
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis	May 14, 2025	DenoisingDepth Estimation	CodeCode Available	7	5
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs	Jul 31, 2023	Trajectory PlanningZero-shot Generalization	CodeCode Available	5	5
ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth	Feb 23, 2023	Depth EstimationMonocular Depth Estimation	CodeCode Available	5	5
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation	Oct 10, 2024	Zero-shot Generalization	CodeCode Available	5	5
Segment Anything for Videos: A Systematic Survey	Jul 31, 2024	Image SegmentationRobot Manipulation Generalization	CodeCode Available	5	5
MonSter: Marry Monodepth to Stereo Unleashes Power	Jan 15, 2025	Depth EstimationMonocular Depth Estimation	CodeCode Available	4	5
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction	Sep 26, 2024	3D ReconstructionDenoising	CodeCode Available	4	5
Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement	Mar 9, 2025	Domain GeneralizationObject Detection	CodeCode Available	4	5
Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models	Apr 15, 2025	Humanoid ControlReinforcement Learning (RL)	CodeCode Available	4	5
Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers	Jul 14, 2022	RetrievalText Retrieval	CodeCode Available	4	5
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation	Dec 4, 2023	Depth EstimationGPU	CodeCode Available	4	5
Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image	Jul 20, 2023	Depth EstimationImage Reconstruction	CodeCode Available	4	5
Zero-1-to-3: Zero-shot One Image to 3D Object	Mar 20, 2023	3D ReconstructionImage to 3D	CodeCode Available	4	5
DEFOM-Stereo: Depth Foundation Model Based Stereo Matching	Jan 16, 2025	Depth EstimationDisparity Estimation	CodeCode Available	3	5
MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts	May 2, 2024	Combinatorial OptimizationMixture-of-Experts	CodeCode Available	3	5
IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus	Feb 22, 2024	Zero-shot Generalization	CodeCode Available	3	5
Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting	Oct 12, 2023	DecoderProbabilistic Time Series Forecasting	CodeCode Available	3	5
What Language Model to Train if You Have One Million GPU Hours?	Oct 27, 2022	GPULanguage Modeling	CodeCode Available	3	5
ZIM: Zero-Shot Image Matting for Anything	Nov 1, 2024	Image InpaintingImage Matting	CodeCode Available	3	5
Detect Anything 3D in the Wild	Apr 10, 2025	3D Object DetectionAutonomous Driving	CodeCode Available	3	5
3D Diffuser Actor: Policy Diffusion with 3D Scene Representations	Feb 18, 2024	DenoisingRobot Manipulation	CodeCode Available	3	5
Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail	Dec 5, 2024	Stereo MatchingZero-shot Generalization	CodeCode Available	3	5
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up	Dec 20, 2024	8kGPU	CodeCode Available	3	5
Depth Any Camera: Zero-Shot Metric Depth Estimation from Any Camera	Jan 5, 2025	Data AugmentationDepth Estimation	CodeCode Available	3	5
Expanding Language-Image Pretrained Models for General Video Recognition	Aug 4, 2022	Action ClassificationAction Recognition	CodeCode Available	3	5
Separate Anything You Describe	Aug 9, 2023	Audio Source SeparationNatural Language Queries	CodeCode Available	3	5
General Object Foundation Model for Images and Videos at Scale	Dec 14, 2023	Instance SegmentationLong-tail Video Object Segmentation	CodeCode Available	3	5
Objaverse-XL: A Universe of 10M+ 3D Objects	Jul 11, 2023	DiversityNovel View Synthesis	CodeCode Available	3	5
PE3R: Perception-Efficient 3D Reconstruction	Mar 10, 2025	3D ReconstructionZero-shot Generalization	CodeCode Available	3	5
SMART: Scalable Multi-agent Real-time Motion Generation via Next-token Prediction	May 24, 2024	Autonomous DrivingMotion Generation	CodeCode Available	3	5
RobustSAM: Segment Anything Robustly on Degraded Images	Jun 13, 2024	DeblurringImage Dehazing	CodeCode Available	3	5
NeRF-Supervised Deep Stereo	Mar 30, 2023	NeRFNeural Rendering	CodeCode Available	2	5
Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization	May 21, 2025	Vision-Language-ActionZero-shot Generalization	CodeCode Available	2	5
Delineate Anything: Resolution-Agnostic Field Boundary Delineation on Satellite Imagery	Apr 3, 2025	Field Boundary DelineationInstance Segmentation	CodeCode Available	2	5
Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model	Mar 8, 2025	Image Quality AssessmentLanguage Modeling	CodeCode Available	2	5
Crosslingual Generalization through Multitask Finetuning	Nov 3, 2022	Coreference ResolutionCross-Lingual Transfer	CodeCode Available	2	5
Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement	Oct 15, 2024	DisentanglementInductive Bias	CodeCode Available	2	5
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning	Dec 17, 2024	Denoising	CodeCode Available	2	5
Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression	May 26, 2025	Zero-shot Generalization	CodeCode Available	2	5
Multitask Prompted Training Enables Zero-Shot Task Generalization	Oct 15, 2021	BenchmarkingDecoder	CodeCode Available	2	5
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance	Apr 4, 2024	BenchmarkingImage Generation	CodeCode Available	2	5
Learning to Route Among Specialized Experts for Zero-Shot Generalization	Feb 8, 2024	parameter-efficient fine-tuningZero-shot Generalization	CodeCode Available	2	5
BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing	Jun 30, 2022	DiversityLanguage Model Evaluation	CodeCode Available	2	5
BiGR: Harnessing Binary Latent Codes for Image Generation and Improved Visual Representation Capabilities	Oct 18, 2024	Conditional Image GenerationImage Generation	CodeCode Available	2	5
LLM+P: Empowering Large Language Models with Optimal Planning Proficiency	Apr 22, 2023	Zero-shot Generalization	CodeCode Available	2	5

Show:10 25 50

← PrevPage 1 of 12Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GR-MG	Avg. sequence length	4.04	—	Unverified
2	MoDE	Avg. sequence length	4.01	—	Unverified
3	RoboUniView	Avg. sequence length	3.65	—	Unverified
4	3D Diffuser Actor	Avg. sequence length	3.27	—	Unverified
5	GR-1	Avg. sequence length	3.06	—	Unverified