| Matryoshka Diffusion Models | Oct 23, 2023 | Image Generation, Zero-shot Generalization | Code Available | 2 |
| EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task Tasks for E-commerce | Aug 14, 2023 | Diversity, Instruction Following | Code Available | 2 |
| 2nd Place Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection | Jun 15, 2023 | Anomaly Detection, Anomaly Localization | Code Available | 2 |
| Segment Any Anomaly without Training via Hybrid Prompt Regularization | May 18, 2023 | Anomaly Detection, Anomaly Localization | Code Available | 2 |
| LLM+P: Empowering Large Language Models with Optimal Planning Proficiency | Apr 22, 2023 | Zero-shot Generalization | Code Available | 2 |
| Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agents | Apr 19, 2023 | Information Retrieval, Passage Ranking | Code Available | 2 |
| NeRF-Supervised Deep Stereo | Mar 30, 2023 | NeRF, Neural Rendering | Code Available | 2 |
| Detecting Everything in the Open World: Towards Universal Object Detection | Mar 21, 2023 | object-detection, Object Detection | Code Available | 2 |
| Crosslingual Generalization through Multitask Finetuning | Nov 3, 2022 | Coreference Resolution, Cross-Lingual Transfer | Code Available | 2 |
| VIMA: General Robot Manipulation with Multimodal Prompts | Oct 6, 2022 | Imitation Learning, Language Modelling | Code Available | 2 |
| Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models | Sep 15, 2022 | image-classification, Image Classification | Code Available | 2 |
| BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing | Jun 30, 2022 | Diversity, Language Model Evaluation | Code Available | 2 |
| Multitask Prompted Training Enables Zero-Shot Task Generalization | Oct 15, 2021 | Benchmarking, Decoder | Code Available | 2 |
| IRanker: Towards Ranking Foundation Model | Jun 25, 2025 | GSM8K, model | Code Available | 1 |
| OWMM-Agent: Open World Mobile Manipulation With Multi-modal Agentic Data Synthesis | Jun 4, 2025 | Action Generation, Decision Making | Code Available | 1 |
| Beyond the LUMIR challenge: The pathway to foundational registration models | May 30, 2025 | Image Registration, Zero-shot Generalization | Code Available | 1 |
| DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis? | May 30, 2025 | Diagnostic, Medical Image Analysis | Code Available | 1 |
| ReasonPlan: Unified Scene Prediction and Decision Reasoning for Closed-loop Autonomous Driving | May 26, 2025 | Autonomous Driving, Bench2Drive | Code Available | 1 |
| Universal Biological Sequence Reranking for Improved De Novo Peptide Sequencing | May 23, 2025 | de novo peptide sequencing, Reranking | Code Available | 1 |
| Foundation Models Knowledge Distillation For Battery Capacity Degradation Forecast | May 13, 2025 | Knowledge Distillation, Time Series | Code Available | 1 |
| Benchmarking Vision, Language, & Action Models in Procedurally Generated, Open Ended Action Environments | May 8, 2025 | Benchmarking, Prompt Engineering | Code Available | 1 |
| Towards Ball Spin and Trajectory Analysis in Table Tennis Broadcast Videos via Physically Grounded Synthetic-to-Real Transfer | Apr 28, 2025 | Monocular 3D Object Localization, Sports Analytics | Code Available | 1 |
| Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detections | Apr 15, 2025 | Anomaly Detection, Anomaly Localization | Code Available | 1 |
| PicoPose: Progressive Pixel-to-Pixel Correspondence Learning for Novel Object Pose Estimation | Apr 3, 2025 | Object, Pose Estimation | Code Available | 1 |
| FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images | Mar 24, 2025 | 3D Canonicalization, Zero-shot Generalization | Code Available | 1 |