| Disentangling Representations through Multi-task Learning | Jul 15, 2024 | Decision Making, Multi-Task Learning | Unverified | 0 |
| ScaleFlow++: Robust and Accurate Estimation of 3D Motion from Video | Jul 13, 2024 | Autonomous Driving, Motion Estimation | Code Available | 1 |
| Adaptive Prediction Ensemble: Improving Out-of-Distribution Generalization of Motion Forecasting | Jul 12, 2024 | Autonomous Driving, Deep Learning | Unverified | 0 |
| Real-Time Anomaly Detection and Reactive Planning with Large Language Models | Jul 11, 2024 | Anomaly Detection, Autonomous Vehicles | Unverified | 0 |
| Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization | Jul 11, 2024 | Data Augmentation, Domain Generalization | Unverified | 0 |
| Swiss DINO: Efficient and Versatile Vision Framework for On-device Personal Object Search | Jul 10, 2024 | Few-Shot Learning, GPU | Code Available | 0 |
| Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation | Jul 10, 2024 | Instance Segmentation, Semantic Segmentation | Code Available | 1 |
| Improving Zero-shot Generalization of Learned Prompts via Unsupervised Knowledge Distillation | Jul 3, 2024 | Domain Generalization, Knowledge Distillation | Code Available | 2 |
| Cross-Modal Attention Alignment Network with Auxiliary Text Description for zero-shot sketch-based image retrieval | Jul 1, 2024 | cross-modal alignment, Image Retrieval | Unverified | 0 |
| A Two-stage Reinforcement Learning-based Approach for Multi-entity Task Allocation | Jun 29, 2024 | Combinatorial Optimization, reinforcement-learning | Code Available | 1 |
| RoboUniView: Visual-Language Model with Unified View Representation for Robotic Manipulation | Jun 27, 2024 | Language Modeling, Language Modelling | Code Available | 2 |
| NeuralSCF: Neural network self-consistent fields for density functional theory | Jun 22, 2024 | Zero-shot Generalization | Unverified | 0 |
| GeoBench: Benchmarking and Analyzing Monocular Geometry Estimation Models | Jun 18, 2024 | Benchmarking, Depth Estimation | Code Available | 2 |
| Words in Motion: Extracting Interpretable Control Vectors for Motion Transformers | Jun 17, 2024 | Motion Forecasting, Zero-shot Generalization | Unverified | 0 |
| Zero-Shot Generalization during Instruction Tuning: Insights from Similarity and Granularity | Jun 17, 2024 | Continual Learning, Zero-shot Generalization | Code Available | 0 |
| RobustSAM: Segment Anything Robustly on Degraded Images | Jun 13, 2024 | Deblurring, Image Dehazing | Code Available | 3 |
| Deep Exploration of Cross-Lingual Zero-Shot Generalization in Instruction Tuning | Jun 13, 2024 | Zero-shot Generalization | Code Available | 0 |
| Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models | Jun 5, 2024 | Few-Shot Learning, Language Modeling | Code Available | 2 |
| Prompt-based Visual Alignment for Zero-shot Policy Transfer | Jun 5, 2024 | Autonomous Driving, Language Modelling | Unverified | 0 |
| GOMAA-Geo: GOal Modality Agnostic Active Geo-localization | Jun 4, 2024 | Contrastive Learning, geo-localization | Code Available | 1 |
| OLIVE: Object Level In-Context Visual Embeddings | Jun 2, 2024 | Object, Zero-shot Generalization | Code Available | 0 |
| μLO: Compute-Efficient Meta-Generalization of Learned Optimizers | May 31, 2024 | GPU, Zero-shot Generalization | Code Available | 1 |
| Text-only Synthesis for Image Captioning | May 28, 2024 | Image Captioning, Language Modelling | Unverified | 0 |
| TIMA: Text-Image Mutual Awareness for Balancing Zero-Shot Adversarial Robustness and Generalization Ability | May 27, 2024 | Adversarial Robustness, Knowledge Distillation | Unverified | 0 |
| Benchmarking General-Purpose In-Context Learning | May 27, 2024 | Benchmarking, Decision Making | Unverified | 0 |