SOTAVerified

Zero-shot Generalization

Papers

Showing 101–150 of 572 papers

Title | Status | Hype
Equivariant Image Modeling | Code | 1
STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding | Code | 1
Nature-Inspired Population-Based Evolution of Large Language Models | Code | 1
Delving into Out-of-Distribution Detection with Medical Vision-Language Models | Code | 1
Model Generalization on Text Attribute Graphs: Principles with Large Language Models | Code | 1
LR0.FM: Low-Res Benchmark and Improving Robustness for Zero-Shot Classification in Foundation Models | Code | 1
Improving Zero-Shot Object-Level Change Detection by Incorporating Visual Correspondence | Code | 1
FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images | Code | 1
OW-OVD: Unified Open World and Open Vocabulary Object Detection | Code | 1
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization | Code | 1
Towards Open-Vocabulary Video Semantic Segmentation | Code | 1
SAM-Mamba: Mamba Guided SAM Architecture for Generalized Zero-Shot Polyp Segmentation | Code | 1
Knowledge Transfer and Domain Adaptation for Fine-Grained Remote Sensing Image Segmentation | Code | 1
COMPrompter: reconceptualized segment anything model with multiprompt network for camouflaged object detection | Code | 1
Instruction-Tuning Llama-3-8B Excels in City-Scale Mobility Prediction | Code | 1
M^2PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning | Code | 1
ScaleFlow++: Robust and Accurate Estimation of 3D Motion from Video | Code | 1
Adapting Segment Anything Model to Multi-modal Salient Object Detection with Semantic Feature Fusion Guidance | Code | 1
Generalizable Facial Expression Recognition | Code | 1
Visual Grounding for Object-Level Generalization in Reinforcement Learning | Code | 1
Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Vision-Language Models | Code | 1
ScaleFlow++: Robust and Accurate Estimation of 3D Motion from Video | Code | 1
Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation | Code | 1
A Two-stage Reinforcement Learning-based Approach for Multi-entity Task Allocation | Code | 1
GOMAA-Geo: GOal Modality Agnostic Active Geo-localization | Code | 1
μLO: Compute-Efficient Meta-Generalization of Learned Optimizers | Code | 1
M^3GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation | Code | 1
CompilerDream: Learning a Compiler World Model for General Code Optimization | Code | 1
CLIP-Embed-KD: Computationally Efficient Knowledge Distillation Using Embeddings as Teachers | Code | 1
Where to Move Next: Zero-shot Generalization of LLMs for Next POI Recommendation | Code | 1
Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models | Code | 1
Data-Efficient Contrastive Language-Image Pretraining: Prioritizing Data Quality over Quantity | Code | 1
Dreaming of Many Worlds: Learning Contextual World Models Aids Zero-Shot Generalization | Code | 1
FastSAM3D: An Efficient Segment Anything Model for 3D Volumetric Medical Images | Code | 1
Augmenting Efficient Real-time Surgical Instrument Segmentation in Video with Point Tracking and Segment Anything | Code | 1
FluoroSAM: A Language-aligned Foundation Model for X-ray Image Segmentation | Code | 1
Zero-shot Generalizable Incremental Learning for Vision-Language Object Detection | Code | 1
Multimodal Instruction Tuning with Conditional Mixture of LoRA | Code | 1
Multi-Task Learning for Routing Problem with Cross-Problem Zero-Shot Generalization | Code | 1
Triple-Encoders: Representations That Fire Together, Wire Together | Code | 1
Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains | Code | 1
Symbol: Generating Flexible Black-Box Optimizers through Symbolic Equation Learning | Code | 1
Exploring the Best Practices of Query Expansion with Large Language Models | Code | 1
MatSAM: Efficient Extraction of Microstructures of Materials via Visual Large Model | Code | 1
Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness | Code | 1
How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation | Code | 1
Large Language Models are Good Prompt Learners for Low-Shot Image Classification | Code | 1
MuRF: Multi-Baseline Radiance Fields | Code | 1
Boosting Segment Anything Model Towards Open-Vocabulary Learning | Code | 1
VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning | Code | 1

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | GR-MG | Avg. sequence length | 4.04 | | Unverified
2 | MoDE | Avg. sequence length | 4.01 | | Unverified
3 | RoboUniView | Avg. sequence length | 3.65 | | Unverified
4 | 3D Diffuser Actor | Avg. sequence length | 3.27 | | Unverified
5 | GR-1 | Avg. sequence length | 3.06 | | Unverified