| Do Pedestrians Pay Attention? Eye Contact Detection in the Wild | Dec 8, 2021 | Autonomous Vehicles, Contact Detection | Code Available | 1 |
| FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization | Dec 2, 2021 | counterfactual, Image Generation | Code Available | 1 |
| Learning to automate cryo-electron microscopy data collection with Ptolemy | Dec 1, 2021 | Cryogenic Electron Microscopy (cryo-EM), Navigate | Code Available | 1 |
| Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision | Dec 1, 2021 | cross-modal alignment, Navigate | Code Available | 1 |
| Catch Me If You Hear Me: Audio-Visual Navigation in Complex Unmapped Environments with Moving Sounds | Nov 29, 2021 | Navigate, Visual Navigation | Code Available | 1 |
| Simple but Effective: CLIP Embeddings for Embodied AI | Nov 18, 2021 | Image Manipulation, Navigate | Code Available | 1 |
| Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation | Nov 10, 2021 | Decoder, Navigate | Code Available | 1 |
| History Aware Multimodal Transformer for Vision-and-Language Navigation | Oct 25, 2021 | Decision Making, Navigate | Code Available | 1 |
| No RL, No Simulation: Learning to Navigate without Navigating | Oct 18, 2021 | Navigate, Reinforcement Learning (RL) | Code Available | 1 |
| SGoLAM: Simultaneous Goal Localization and Mapping for Multi-Object Goal Navigation | Oct 14, 2021 | Navigate, Visual Navigation | Code Available | 1 |