| Equivariant Image Modeling | Mar 24, 2025 | Image Generation, Zero-shot Generalization | Code Available | 1 |
| STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding | Mar 20, 2025 | Video Understanding, Zero-shot Generalization | Code Available | 1 |
| Nature-Inspired Population-Based Evolution of Large Language Models | Mar 3, 2025 | GPU, Zero-shot Generalization | Code Available | 1 |
| Delving into Out-of-Distribution Detection with Medical Vision-Language Models | Mar 2, 2025 | Benchmarking, image-classification | Code Available | 1 |
| Model Generalization on Text Attribute Graphs: Principles with Large Language Models | Feb 17, 2025 | Attribute, Graph Learning | Code Available | 1 |
| LR0.FM: Low-Res Benchmark and Improving Robustness for Zero-Shot Classification in Foundation Models | Feb 6, 2025 | zero-shot-classification, Zero-shot Generalization | Code Available | 1 |
| Improving Zero-Shot Object-Level Change Detection by Incorporating Visual Correspondence | Jan 9, 2025 | Change Detection, Zero-shot Generalization | Code Available | 1 |
| FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images | Jan 1, 2025 | 3D Canonicalization, Zero-shot Generalization | Code Available | 1 |
| OW-OVD: Unified Open World and Open Vocabulary Object Detection | Jan 1, 2025 | Attribute, Incremental Learning | Code Available | 1 |
| Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization | Dec 24, 2024 | Image Segmentation, Semantic Segmentation | Code Available | 1 |
| Towards Open-Vocabulary Video Semantic Segmentation | Dec 12, 2024 | Segmentation, Semantic Segmentation | Code Available | 1 |
| SAM-Mamba: Mamba Guided SAM Architecture for Generalized Zero-Shot Polyp Segmentation | Dec 11, 2024 | Mamba, Segmentation | Code Available | 1 |
| Knowledge Transfer and Domain Adaptation for Fine-Grained Remote Sensing Image Segmentation | Dec 9, 2024 | Domain Adaptation, Image Segmentation | Code Available | 1 |
| COMPrompter: reconceptualized segment anything model with multiprompt network for camouflaged object detection | Nov 28, 2024 | object-detection, Object Detection | Code Available | 1 |
| Instruction-Tuning Llama-3-8B Excels in City-Scale Mobility Prediction | Oct 31, 2024 | Disaster Response, Language Modeling | Code Available | 1 |
| M^2PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning | Sep 24, 2024 | Zero-shot Generalization | Code Available | 1 |
| ScaleFlow++: Robust and Accurate Estimation of 3D Motion from Video | Sep 16, 2024 | Autonomous Driving, motion prediction | Code Available | 1 |
| Adapting Segment Anything Model to Multi-modal Salient Object Detection with Semantic Feature Fusion Guidance | Aug 27, 2024 | Decoder, object-detection | Code Available | 1 |
| Generalizable Facial Expression Recognition | Aug 20, 2024 | Domain Adaptation, Facial Expression Recognition | Code Available | 1 |
| Visual Grounding for Object-Level Generalization in Reinforcement Learning | Aug 4, 2024 | Language Modelling, Object | Code Available | 1 |
| Test-Time Low Rank Adaptation via Confidence Maximization for Zero-Shot Generalization of Vision-Language Models | Jul 22, 2024 | Zero-shot Generalization | Code Available | 1 |
| ScaleFlow++: Robust and Accurate Estimation of 3D Motion from Video | Jul 13, 2024 | Autonomous Driving, Motion Estimation | Code Available | 1 |
| Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation | Jul 10, 2024 | Instance Segmentation, Semantic Segmentation | Code Available | 1 |
| A Two-stage Reinforcement Learning-based Approach for Multi-entity Task Allocation | Jun 29, 2024 | Combinatorial Optimization, reinforcement-learning | Code Available | 1 |
| GOMAA-Geo: GOal Modality Agnostic Active Geo-localization | Jun 4, 2024 | Contrastive Learning, geo-localization | Code Available | 1 |