| IMRL: Integrating Visual, Physical, Temporal, and Geometric Representations for Enhanced Food Acquisition | Sep 18, 2024 | Imitation Learning, Reinforcement Learning (RL) | Unverified | 0 |
| ScaleFlow++: Robust and Accurate Estimation of 3D Motion from Video | Sep 16, 2024 | Autonomous Driving, motion prediction | Code Available | 1 |
| Benchmarking VLMs' Reasoning About Persuasive Atypical Images | Sep 16, 2024 | Benchmarking, Object Recognition | Unverified | 0 |
| PrimeDepth: Efficient Monocular Depth Estimation with a Stable Diffusion Preimage | Sep 13, 2024 | Depth Estimation, Monocular Depth Estimation | Code Available | 2 |
| AnySkin: Plug-and-play Skin Sensing for Robotic Touch | Sep 12, 2024 | Zero-shot Generalization | Unverified | 0 |
| IndicVoices-R: Unlocking a Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS | Sep 9, 2024 | Denoising, Speech Enhancement | Code Available | 2 |
| TanDepth: Leveraging Global DEMs for Metric Monocular Depth Estimation in UAVs | Sep 8, 2024 | Depth Estimation, Monocular Depth Estimation | Unverified | 0 |
| Adapting Segment Anything Model to Multi-modal Salient Object Detection with Semantic Feature Fusion Guidance | Aug 27, 2024 | Decoder, object-detection | Code Available | 1 |
| GR-MG: Leveraging Partially Annotated Data via Multi-Modal Goal-Conditioned Policy | Aug 26, 2024 | Few-Shot Learning, Image Generation | Code Available | 2 |
| Segment Anything Model for Grain Characterization in Hard Drive Design | Aug 22, 2024 | Zero-shot Generalization | Unverified | 0 |