Zero-shot Generalization

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 1–25 of 572 papers

Title	Date	Tasks	Status	Hype
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction	Apr 3, 2024	Image GenerationImage Reconstruction	CodeCode Available	9
Visually Descriptive Language Model for Vector Graphics Reasoning	Apr 9, 2024	DescriptiveLanguage Modeling	CodeCode Available	9
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation	Mar 22, 2024	Depth EstimationSurface Normal Estimation	CodeCode Available	7
Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis	May 14, 2025	DenoisingDepth Estimation	CodeCode Available	7
Large Concept Models: Language Modeling in a Sentence Representation Space	Dec 11, 2024	Language ModelingLanguage Modelling	CodeCode Available	7
FoundationStereo: Zero-Shot Stereo Matching	Jan 17, 2025	Depth EstimationDiversity	CodeCode Available	7
ZoeDepth: Zero-shot Transfer by Combining Relative and Metric Depth	Feb 23, 2023	Depth EstimationMonocular Depth Estimation	CodeCode Available	5
Segment Anything for Videos: A Systematic Survey	Jul 31, 2024	Image SegmentationRobot Manipulation Generalization	CodeCode Available	5
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs	Jul 31, 2023	Trajectory PlanningZero-shot Generalization	CodeCode Available	5
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation	Oct 10, 2024	Zero-shot Generalization	CodeCode Available	5
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation	Dec 4, 2023	Depth EstimationGPU	CodeCode Available	4
MonSter: Marry Monodepth to Stereo Unleashes Power	Jan 15, 2025	Depth EstimationMonocular Depth Estimation	CodeCode Available	4
Zero-Shot Whole-Body Humanoid Control via Behavioral Foundation Models	Apr 15, 2025	Humanoid ControlReinforcement Learning (RL)	CodeCode Available	4
Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image	Jul 20, 2023	Depth EstimationImage Reconstruction	CodeCode Available	4
Zero-1-to-3: Zero-shot One Image to 3D Object	Mar 20, 2023	3D ReconstructionImage to 3D	CodeCode Available	4
Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement	Mar 9, 2025	Domain GeneralizationObject Detection	CodeCode Available	4
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction	Sep 26, 2024	3D ReconstructionDenoising	CodeCode Available	4
Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated Neural Text Retrievers	Jul 14, 2022	RetrievalText Retrieval	CodeCode Available	4
RobustSAM: Segment Anything Robustly on Degraded Images	Jun 13, 2024	DeblurringImage Dehazing	CodeCode Available	3
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up	Dec 20, 2024	8kGPU	CodeCode Available	3
General Object Foundation Model for Images and Videos at Scale	Dec 14, 2023	Instance SegmentationLong-tail Video Object Segmentation	CodeCode Available	3
PE3R: Perception-Efficient 3D Reconstruction	Mar 10, 2025	3D ReconstructionZero-shot Generalization	CodeCode Available	3
MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts	May 2, 2024	Combinatorial OptimizationMixture-of-Experts	CodeCode Available	3
Expanding Language-Image Pretrained Models for General Video Recognition	Aug 4, 2022	Action ClassificationAction Recognition	CodeCode Available	3
Detect Anything 3D in the Wild	Apr 10, 2025	3D Object DetectionAutonomous Driving	CodeCode Available	3

Show:10 25 50

← PrevPage 1 of 23Next →

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GR-MG	Avg. sequence length	4.04	—	Unverified
2	MoDE	Avg. sequence length	4.01	—	Unverified
3	RoboUniView	Avg. sequence length	3.65	—	Unverified
4	3D Diffuser Actor	Avg. sequence length	3.27	—	Unverified
5	GR-1	Avg. sequence length	3.06	—	Unverified