| Equivariant Image Modeling | Mar 24, 2025 | Image Generation, Zero-shot Generalization | Code Available | 1 |
| STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding | Mar 20, 2025 | Video Understanding, Zero-shot Generalization | Code Available | 1 |
| Nature-Inspired Population-Based Evolution of Large Language Models | Mar 3, 2025 | GPU, Zero-shot Generalization | Code Available | 1 |
| Delving into Out-of-Distribution Detection with Medical Vision-Language Models | Mar 2, 2025 | Benchmarking, image-classification | Code Available | 1 |
| Model Generalization on Text Attribute Graphs: Principles with Large Language Models | Feb 17, 2025 | Attribute, Graph Learning | Code Available | 1 |
| LR0.FM: Low-Res Benchmark and Improving Robustness for Zero-Shot Classification in Foundation Models | Feb 6, 2025 | zero-shot-classification, Zero-shot Generalization | Code Available | 1 |
| Improving Zero-Shot Object-Level Change Detection by Incorporating Visual Correspondence | Jan 9, 2025 | Change Detection, Zero-shot Generalization | Code Available | 1 |
| FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images | Jan 1, 2025 | 3D Canonicalization, Zero-shot Generalization | Code Available | 1 |
| OW-OVD: Unified Open World and Open Vocabulary Object Detection | Jan 1, 2025 | Attribute, Incremental Learning | Code Available | 1 |
| Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization | Dec 24, 2024 | Image Segmentation, Semantic Segmentation | Code Available | 1 |
| Towards Open-Vocabulary Video Semantic Segmentation | Dec 12, 2024 | Segmentation, Semantic Segmentation | Code Available | 1 |
| SAM-Mamba: Mamba Guided SAM Architecture for Generalized Zero-Shot Polyp Segmentation | Dec 11, 2024 | Mamba, Segmentation | Code Available | 1 |
| Knowledge Transfer and Domain Adaptation for Fine-Grained Remote Sensing Image Segmentation | Dec 9, 2024 | Domain Adaptation, Image Segmentation | Code Available | 1 |
| COMPrompter: reconceptualized segment anything model with multiprompt network for camouflaged object detection | Nov 28, 2024 | object-detection, Object Detection | Code Available | 1 |
| Instruction-Tuning Llama-3-8B Excels in City-Scale Mobility Prediction | Oct 31, 2024 | Disaster Response, Language Modeling | Code Available | 1 |
| M^2PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning | Sep 24, 2024 | Zero-shot Generalization | Code Available | 1 |
| ScaleFlow++: Robust and Accurate Estimation of 3D Motion from Video | Sep 16, 2024 | Autonomous Driving, motion prediction | Code Available | 1 |
| Adapting Segment Anything Model to Multi-modal Salient Object Detection with Semantic Feature Fusion Guidance | Aug 27, 2024 | Decoder, object-detection | Code Available | 1 |
| Generalizable Facial Expression Recognition | Aug 20, 2024 | Domain Adaptation, Facial Expression Recognition | Code Available | 1 |
| Visual Grounding for Object-Level Generalization in Reinforcement Learning | Aug 4, 2024 | Language Modelling, Object | Code Available | 1 |
| Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Vision-Language Models | Jul 22, 2024 | Zero-shot Generalization | Code Available | 1 |
| ScaleFlow++: Robust and Accurate Estimation of 3D Motion from Video | Jul 13, 2024 | Autonomous Driving, Motion Estimation | Code Available | 1 |
| Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation | Jul 10, 2024 | Instance Segmentation, Semantic Segmentation | Code Available | 1 |
| A Two-stage Reinforcement Learning-based Approach for Multi-entity Task Allocation | Jun 29, 2024 | Combinatorial Optimization, reinforcement-learning | Code Available | 1 |
| GOMAA-Geo: GOal Modality Agnostic Active Geo-localization | Jun 4, 2024 | Contrastive Learning, geo-localization | Code Available | 1 |
| μLO: Compute-Efficient Meta-Generalization of Learned Optimizers | May 31, 2024 | GPU, Zero-shot Generalization | Code Available | 1 |
| M^3GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation | May 25, 2024 | Language Modeling, Language Modelling | Code Available | 1 |
| CompilerDream: Learning a Compiler World Model for General Code Optimization | Apr 24, 2024 | Diversity, Model-based Reinforcement Learning | Code Available | 1 |
| CLIP-Embed-KD: Computationally Efficient Knowledge Distillation Using Embeddings as Teachers | Apr 9, 2024 | Knowledge Distillation, Zero-shot Generalization | Code Available | 1 |
| Where to Move Next: Zero-shot Generalization of LLMs for Next POI Recommendation | Apr 2, 2024 | Zero-shot Generalization | Code Available | 1 |
| Just Shift It: Test-Time Prototype Shifting for Zero-Shot Generalization with Vision-Language Models | Mar 19, 2024 | image-classification, Image Classification | Code Available | 1 |
| Data-Efficient Contrastive Language-Image Pretraining: Prioritizing Data Quality over Quantity | Mar 18, 2024 | Zero-shot Generalization | Code Available | 1 |
| Dreaming of Many Worlds: Learning Contextual World Models Aids Zero-Shot Generalization | Mar 16, 2024 | Zero-shot Generalization | Code Available | 1 |
| FastSAM3D: An Efficient Segment Anything Model for 3D Volumetric Medical Images | Mar 14, 2024 | 3D Medical Imaging Segmentation, GPU | Code Available | 1 |
| Augmenting Efficient Real-time Surgical Instrument Segmentation in Video with Point Tracking and Segment Anything | Mar 12, 2024 | GPU, Point Tracking | Code Available | 1 |
| FluoroSAM: A Language-aligned Foundation Model for X-ray Image Segmentation | Mar 12, 2024 | Diagnostic, Image Segmentation | Code Available | 1 |
| Zero-shot Generalizable Incremental Learning for Vision-Language Object Detection | Mar 4, 2024 | Incremental Learning, object-detection | Code Available | 1 |
| Multimodal Instruction Tuning with Conditional Mixture of LoRA | Feb 24, 2024 | parameter-efficient fine-tuning, Zero-shot Generalization | Code Available | 1 |
| Multi-Task Learning for Routing Problem with Cross-Problem Zero-Shot Generalization | Feb 23, 2024 | Attribute, Combinatorial Optimization | Code Available | 1 |
| Triple-Encoders: Representations That Fire Together, Wire Together | Feb 19, 2024 | Contrastive Learning, Representation Learning | Code Available | 1 |
| Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains | Feb 6, 2024 | TAG, Zero-shot Generalization | Code Available | 1 |
| Symbol: Generating Flexible Black-Box Optimizers through Symbolic Equation Learning | Feb 4, 2024 | Meta-Learning, Zero-shot Generalization | Code Available | 1 |
| Exploring the Best Practices of Query Expansion with Large Language Models | Jan 12, 2024 | Information Retrieval, Re-Ranking | Code Available | 1 |
| MatSAM: Efficient Extraction of Microstructures of Materials via Visual Large Model | Jan 11, 2024 | Image Segmentation, Prompt Engineering | Code Available | 1 |
| Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness | Jan 9, 2024 | Adversarial Robustness, Zero-shot Generalization | Code Available | 1 |
| How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation | Dec 12, 2023 | Anomaly Detection, Autonomous Driving | Code Available | 1 |
| Large Language Models are Good Prompt Learners for Low-Shot Image Classification | Dec 7, 2023 | Classification, Few-Shot Image Classification | Code Available | 1 |
| MuRF: Multi-Baseline Radiance Fields | Dec 7, 2023 | NeRF, Zero-shot Generalization | Code Available | 1 |
| Boosting Segment Anything Model Towards Open-Vocabulary Learning | Dec 6, 2023 | model, Object | Code Available | 1 |
| VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning | Nov 25, 2023 | Decoder, Model Optimization | Code Available | 1 |