| Decision Transformer as a Foundation Model for Partially Observable Continuous Control | Apr 3, 2024 | continuous-control, Continuous Control | Unverified | 0 |
| F^2Depth: Self-supervised Indoor Monocular Depth Estimation via Optical Flow Consistency and Feature Map Synthesis | Mar 27, 2024 | Depth Estimation, Indoor Monocular Depth Estimation | Unverified | 0 |
| Federated reinforcement learning for robot motion planning with zero-shot generalization | Mar 20, 2024 | Motion Planning, Zero-shot Generalization | Unverified | 0 |
| Quantifying uncertainty in lung cancer segmentation with foundation models applied to mixed-domain datasets | Mar 19, 2024 | Computed Tomography (CT), Segmentation | Unverified | 0 |
| Temporal-spatial Adaptation of Promptable SAM Enhance Accuracy and Generalizability of cine CMR Segmentation | Mar 15, 2024 | Myocardium Segmentation, Segmentation | Unverified | 0 |
| SAM-Lightening: A Lightweight Segment Anything Model with Dilated Flash Attention to Achieve 30 times Acceleration | Mar 14, 2024 | Transfer Learning, Zero-shot Generalization | Unverified | 0 |
| Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models | Mar 14, 2024 | Continual Learning, Knowledge Distillation | Unverified | 0 |
| In-context Prompt Learning for Test-time Vision Recognition with Frozen Vision-language Model | Mar 10, 2024 | In-Context Learning, Language Modeling | Unverified | 0 |
| SAM-PD: How Far Can SAM Take Us in Tracking and Segmenting Anything in Videos by Prompt Denoising | Mar 7, 2024 | Denoising, Instance Segmentation | Code Available | 0 |
| Segment anything model for head and neck tumor segmentation with CT, PET and MRI multi-modality images | Feb 27, 2024 | Segmentation, Tumor Segmentation | Code Available | 0 |
| ARL2: Aligning Retrievers for Black-box Large Language Models via Self-guided Adaptive Relevance Labeling | Feb 21, 2024 | MMLU, Retrieval | Code Available | 0 |
| Zero-shot generalization across architectures for visual classification | Feb 21, 2024 | Classification, Zero-shot Generalization | Code Available | 0 |
| ISCUTE: Instance Segmentation of Cables Using Text Embedding | Feb 19, 2024 | Instance Segmentation, Object Recognition | Unverified | 0 |
| From Real World to Logic and Back: Learning Generalizable Relational Concepts For Long Horizon Robot Planning | Feb 19, 2024 | Motion Planning, Task and Motion Planning | Unverified | 0 |
| Unsupervised Discovery of Object-Centric Neural Fields | Feb 12, 2024 | Object, Object Discovery | Unverified | 0 |
| On the Out-Of-Distribution Generalization of Multimodal Large Language Models | Feb 9, 2024 | In-Context Learning, Out-of-Distribution Generalization | Unverified | 0 |
| Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control | Feb 9, 2024 | Zero-shot Generalization | Code Available | 0 |
| InCoRo: In-Context Learning for Robotics Control with Feedback Loops | Feb 7, 2024 | In-Context Learning, Scene Understanding | Unverified | 0 |
| Image-Caption Encoding for Improving Zero-Shot Generalization | Feb 5, 2024 | image-classification, Image Classification | Code Available | 0 |
| Convolution Meets LoRA: Parameter Efficient Finetuning for Segment Anything Model | Jan 31, 2024 | Image Segmentation, parameter-efficient fine-tuning | Unverified | 0 |
| Data-Free Generalized Zero-Shot Learning | Jan 28, 2024 | Generalized Zero-Shot Learning, zero-shot-classification | Code Available | 0 |
| Solving Continual Offline Reinforcement Learning with Decision Transformer | Jan 16, 2024 | Offline RL, reinforcement-learning | Unverified | 0 |
| An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models | Jan 12, 2024 | Active Learning, Diversity | Unverified | 0 |
| TimeGraphs: Graph-based Temporal Reasoning | Jan 6, 2024 | Zero-shot Generalization | Unverified | 0 |
| ConfusionPrompt: Practical Private Inference for Online Large Language Models | Dec 30, 2023 | Privacy Preserving, Zero-shot Generalization | Unverified | 0 |
| A Dual Curriculum Learning Framework for Multi-UAV Pursuit-Evasion in Diverse Environments | Dec 19, 2023 | Reinforcement Learning (RL), Zero-shot Generalization | Unverified | 0 |
| Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning | Dec 19, 2023 | Diversity, Instruction Following | Unverified | 0 |
| Towards the Unification of Generative and Discriminative Visual Foundation Model: A Survey | Dec 15, 2023 | Image Generation, Image Segmentation | Unverified | 0 |
| MmAP: Multi-modal Alignment Prompt for Cross-domain Multi-task Learning | Dec 14, 2023 | Decoder, Language Modelling | Unverified | 0 |
| Adaptive Human Trajectory Prediction via Latent Corridors | Dec 11, 2023 | Prediction, Trajectory Prediction | Unverified | 0 |
| Multi-View Unsupervised Image Generation with Cross Attention Guidance | Dec 7, 2023 | Hard Attention, Image Generation | Unverified | 0 |
| MASP: Scalable GNN-based Planning for Multi-Agent Navigation | Dec 5, 2023 | Reinforcement Learning (RL), Zero-shot Generalization | Unverified | 0 |
| I-PHYRE: Interactive Physical Reasoning | Dec 4, 2023 | Zero-shot Generalization | Unverified | 0 |
| Categorical Traffic Transformer: Interpretable and Diverse Behavior Prediction with Tokenized Latent | Nov 30, 2023 | Autonomous Vehicles, Common Sense Reasoning | Unverified | 0 |
| Large Model Based Referring Camouflaged Object Detection | Nov 28, 2023 | model, Object | Unverified | 0 |
| UniIR: Training and Benchmarking Universal Multimodal Information Retrievers | Nov 28, 2023 | Benchmarking, Information Retrieval | Unverified | 0 |
| C-SAW: Self-Supervised Prompt Learning for Image Generalization in Remote Sensing | Nov 27, 2023 | Language Modelling, Prompt Learning | Unverified | 0 |
| A Safer Vision-based Autonomous Planning System for Quadrotor UAVs with Dynamic Obstacle Trajectory Prediction and Its Application with LLMs | Nov 21, 2023 | object-detection, Object Detection | Unverified | 0 |
| Towards Generalizable SER: Soft Labeling and Data Augmentation for Modeling Temporal Emotion Shifts in Large-Scale Multilingual Speech | Nov 15, 2023 | Contrastive Learning, Cross-corpus | Code Available | 0 |
| Improving Zero-shot Visual Question Answering via Large Language Models with Reasoning Question Prompts | Nov 15, 2023 | Question Answering, Sentence | Code Available | 0 |
| Adaptive recurrent vision performs zero-shot computation scaling to unseen difficulty levels | Nov 12, 2023 | Pathfinder, Visual Reasoning | Unverified | 0 |
| A Simple yet Efficient Ensemble Approach for AI-generated Text Detection | Nov 6, 2023 | Language Modelling, Large Language Model | Unverified | 0 |
| Octavius: Mitigating Task Interference in MLLMs via LoRA-MoE | Nov 5, 2023 | Decoder, Mixture-of-Experts | Code Available | 0 |
| Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization | Nov 2, 2023 | Domain Generalization, Prompt Learning | Unverified | 0 |
| Neural Field Dynamics Model for Granular Object Piles Manipulation | Nov 1, 2023 | Object, Zero-shot Generalization | Unverified | 0 |
| ZGUL: Zero-shot Generalization to Unseen Languages using Multi-source Ensembling of Language Adapters | Oct 25, 2023 | Cross-Lingual Transfer, Language Modelling | Code Available | 0 |
| Robot Skill Generalization via Keypoint Integrated Soft Actor-Critic Gaussian Mixture Models | Oct 23, 2023 | Skill Generalization, Zero-shot Generalization | Unverified | 0 |
| InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining | Oct 11, 2023 | 4k, Decoder | Unverified | 0 |
| What Matters to You? Towards Visual Representation Alignment for Robot Learning | Oct 11, 2023 | Zero-shot Generalization | Unverified | 0 |
| From Supervised to Generative: A Novel Paradigm for Tabular Deep Learning with Large Language Models | Oct 11, 2023 | In-Context Learning, Instruction Following | Code Available | 0 |