Zero-shot Generalization

Papers

Showing 26–50 of 572 papers

Title	Date	Tasks	Status	Hype
RobustSAM: Segment Anything Robustly on Degraded Images	Jun 13, 2024	Deblurring, Image Dehazing	Code Available	3
SMART: Scalable Multi-agent Real-time Motion Generation via Next-token Prediction	May 24, 2024	Autonomous Driving, Motion Generation	Code Available	3
MVMoE: Multi-Task Vehicle Routing Solver with Mixture-of-Experts	May 2, 2024	Combinatorial Optimization, Mixture-of-Experts	Code Available	3
IEPile: Unearthing Large-Scale Schema-Based Information Extraction Corpus	Feb 22, 2024	Zero-shot Generalization	Code Available	3
3D Diffuser Actor: Policy Diffusion with 3D Scene Representations	Feb 18, 2024	Denoising, Robot Manipulation	Code Available	3
General Object Foundation Model for Images and Videos at Scale	Dec 14, 2023	Instance Segmentation, Long-tail Video Object Segmentation	Code Available	3
Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting	Oct 12, 2023	Decoder, Probabilistic Time Series Forecasting	Code Available	3
Separate Anything You Describe	Aug 9, 2023	Audio Source Separation, Natural Language Queries	Code Available	3
Objaverse-XL: A Universe of 10M+ 3D Objects	Jul 11, 2023	Diversity, Novel View Synthesis	Code Available	3
What Language Model to Train if You Have One Million GPU Hours?	Oct 27, 2022	GPU, Language Modeling	Code Available	3
Expanding Language-Image Pretrained Models for General Video Recognition	Aug 4, 2022	Action Classification, Action Recognition	Code Available	3
DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment	Jul 3, 2025	cross-modal alignment, Instruction Following	Code Available	2
WAFT: Warping-Alone Field Transforms for Optical Flow	Jun 26, 2025	Optical Flow Estimation, Zero-shot Generalization	Code Available	2
RecGPT: A Foundation Model for Sequential Recommendation	Jun 6, 2025	Decoder, model	Code Available	2
Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache Compression	May 26, 2025	Zero-shot Generalization	Code Available	2
Exploring the Limits of Vision-Language-Action Manipulations in Cross-task Generalization	May 21, 2025	Vision-Language-Action, Zero-shot Generalization	Code Available	2
SAM2MOT: A Novel Paradigm of Multi-Object Tracking by Segmentation	Apr 6, 2025	Multi-Object Tracking, Object	Code Available	2
Delineate Anything: Resolution-Agnostic Field Boundary Delineation on Satellite Imagery	Apr 3, 2025	Field Boundary Delineation, Instance Segmentation	Code Available	2
Q-Insight: Understanding Image Quality via Visual Reinforcement Learning	Mar 28, 2025	Descriptive, Image Quality Assessment	Code Available	2
Bokehlicious: Photorealistic Bokeh Rendering with Controllable Apertures	Mar 20, 2025	Deblurring, Zero-shot Generalization	Code Available	2
Autoregressive Image Generation with Randomized Parallel Decoding	Mar 13, 2025	Conditional Image Generation, Image Generation	Code Available	2
Efficient Alignment of Unconditioned Action Prior for Language-conditioned Pick and Place in Clutter	Mar 12, 2025	Zero-shot Generalization	Code Available	2
Next Token Is Enough: Realistic Image Quality and Aesthetic Scoring with Multimodal Large Language Model	Mar 8, 2025	Image Quality Assessment, Language Modeling	Code Available	2
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning	Dec 17, 2024	Denoising	Code Available	2
Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient	Nov 26, 2024	GPU, Image Generation	Code Available	2

Benchmark Results

#	Model	Metric	Claimed	Verified	Status
1	GR-MG	Avg. sequence length	4.04	—	Unverified
2	MoDE	Avg. sequence length	4.01	—	Unverified
3	RoboUniView	Avg. sequence length	3.65	—	Unverified
4	3D Diffuser Actor	Avg. sequence length	3.27	—	Unverified
5	GR-1	Avg. sequence length	3.06	—	Unverified