| EndoUIC: Promptable Diffusion Transformer for Unified Illumination Correction in Capsule Endoscopy | Jun 19, 2024 | Exposure Correction, Image Enhancement | Code Available | 1 |
| KERM: Knowledge Enhanced Reasoning for Vision-and-Language Navigation | Mar 28, 2023 | Navigate, Vision and Language Navigation | Code Available | 1 |
| AutoTrans: Automating Transformer Design via Reinforced Architecture Search | Sep 4, 2020 | Natural Language Understanding, Navigate | Code Available | 1 |
| Landmark-RxR: Solving Vision-and-Language Navigation with Fine-Grained Alignment Supervision | Dec 1, 2021 | cross-modal alignment, Navigate | Code Available | 1 |
| StoryGPT-V: Large Language Models as Consistent Story Visualizers | Dec 4, 2023 | Language Modelling, Large Language Model | Code Available | 1 |
| Large Language Models Can Self-Improve At Web Agent Tasks | May 30, 2024 | Navigate | Code Available | 1 |
| General Evaluation for Instruction Conditioned Navigation using Dynamic Time Warping | Jul 11, 2019 | Dynamic Time Warping, Navigate | Code Available | 1 |
| Evaluating Language Models for Mathematics through Interactions | Jun 2, 2023 | Language Modelling, Mathematical Problem-Solving | Code Available | 1 |
| A Study on Learning Social Robot Navigation with Multimodal Perception | Sep 22, 2023 | Decision Making, Navigate | Code Available | 1 |
| FootstepNet: an Efficient Actor-Critic Method for Fast On-line Bipedal Footstep Planning and Forecasting | Mar 19, 2024 | Deep Reinforcement Learning, Navigate | Code Available | 1 |
| Learning from Unlabeled 3D Environments for Vision-and-Language Navigation | Aug 24, 2022 | Language Modeling, Language Modelling | Code Available | 1 |
| Learning to automate cryo-electron microscopy data collection with Ptolemy | Dec 1, 2021 | Cryogenic Electron Microscopy (cryo-EM), Navigate | Code Available | 1 |
| Learning to Move with Affordance Maps | Jan 8, 2020 | Autonomous Navigation, Autonomous Vehicles | Code Available | 1 |
| Learning to Navigate in Synthetically Accessible Chemical Space Using Reinforcement Learning | Jan 1, 2020 | Drug Discovery, Navigate | Code Available | 1 |
| DISCO: Embodied Navigation and Interaction via Differentiable Scene Semantics and Dual-level Control | Jul 20, 2024 | Instruction Following, Navigate | Code Available | 1 |
| Advances in 3D Neural Stylization: A Survey | Nov 30, 2023 | Navigate, Neural Stylization | Code Available | 1 |
| Do Pedestrians Pay Attention? Eye Contact Detection in the Wild | Dec 8, 2021 | Autonomous Vehicles, Contact Detection | Code Available | 1 |
| Differentiable Agent-based Epidemiology | Jul 20, 2022 | Epidemiology, Navigate | Code Available | 1 |
| AidUI: Toward Automated Recognition of Dark Patterns in User Interfaces | Mar 12, 2023 | Navigate | Code Available | 1 |
| DPMPC-Planner: A real-time UAV trajectory planning framework for complex static environments with dynamic obstacles | Sep 14, 2021 | Model Predictive Control, Navigate | Code Available | 1 |
| AI-IMU Dead-Reckoning | Apr 12, 2019 | Dead-Reckoning Prediction, Navigate | Code Available | 1 |
| DFR-FastMOT: Detection Failure Resistant Tracker for Fast Multi-Object Tracking Based on Sensor Fusion | Feb 28, 2023 | Autonomous Vehicles, Multi-Object Tracking | Code Available | 1 |
| AVP-SLAM: Semantic Visual Mapping and Localization for Autonomous Vehicles in the Parking Lot | Jul 3, 2020 | Autonomous Vehicles, Navigate | Code Available | 1 |
| BAA-NGP: Bundle-Adjusting Accelerated Neural Graphics Primitives | Jun 7, 2023 | 3D Scene Reconstruction, Navigate | Code Available | 1 |
| No Free Lunch in LLM Watermarking: Trade-offs in Watermarking Design Choices | Feb 25, 2024 | Navigate | Code Available | 1 |
| Map-based Modular Approach for Zero-shot Embodied Question Answering | May 26, 2024 | Embodied Question Answering, Navigate | Code Available | 1 |
| Mask4D: End-to-End Mask-Based 4D Panoptic Segmentation for LiDAR Sequences | Sep 18, 2023 | 3D Panoptic Segmentation, 4D Panoptic Segmentation | Code Available | 1 |
| SoundSpaces: Audio-Visual Navigation in 3D Environments | Dec 24, 2019 | Deep Reinforcement Learning, Navigate | Code Available | 1 |
| Airbert: In-domain Pretraining for Vision-and-Language Navigation | Aug 20, 2021 | Navigate, Referring Expression | Code Available | 1 |
| Demo Abstract: Real-Time Out-of-Distribution Detection on a Mobile Robot | Nov 15, 2022 | Navigate, object-detection | Code Available | 1 |
| Demystifying Map Space Exploration for NPUs | Oct 7, 2022 | Navigate, Neural Architecture Search | Code Available | 1 |
| DialFRED: Dialogue-Enabled Agents for Embodied Instruction Following | Feb 27, 2022 | Instruction Following, Navigate | Code Available | 1 |
| Deep Reinforcement Learning-Based Mapless Crowd Navigation with Perceived Risk of the Moving Crowd for Mobile Robots | Apr 7, 2023 | Deep Reinforcement Learning, Navigate | Code Available | 1 |
| Factorizing Perception and Policy for Interactive Instruction Following | Dec 6, 2020 | Instruction Following, Navigate | Code Available | 1 |
| MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration | Apr 17, 2022 | Navigate, Retrieval | Code Available | 1 |
| DeepPilot: A CNN for Autonomous Drone Racing | Aug 13, 2020 | Autonomous Navigation, Navigate | Code Available | 1 |
| Multi-Class Segmentation from Aerial Views using Recursive Noise Diffusion | Dec 1, 2022 | Denoising, Navigate | Code Available | 1 |
| Multimodal Transformer with Variable-length Memory for Vision-and-Language Navigation | Nov 10, 2021 | Decoder, Navigate | Code Available | 1 |
| Deep Reinforcement learning for real autonomous mobile robot navigation in indoor environments | May 28, 2020 | continuous-control, Continuous Control | Code Available | 1 |
| Navigating Data Heterogeneity in Federated Learning A Semi-Supervised Federated Object Detection | Oct 26, 2023 | Autonomous Driving, Federated Learning | Code Available | 1 |
| Decentralized Motion Planning for Multi-Robot Navigation using Deep Reinforcement Learning | Nov 11, 2020 | Deep Reinforcement Learning, Motion Planning | Code Available | 1 |
| DD-PPO: Learning Near-Perfect PointGoal Navigators from 2.5 Billion Frames | Nov 1, 2019 | Autonomous Navigation, GPU | Code Available | 1 |
| Neighbor-view Enhanced Model for Vision and Language Navigation | Jul 15, 2021 | Navigate, Vision and Language Navigation | Code Available | 1 |
| Neural Brain: A Neuroscience-inspired Framework for Embodied Agents | May 12, 2025 | Navigate | Code Available | 1 |
| No RL, No Simulation: Learning to Navigate without Navigating | Oct 18, 2021 | Navigate, Reinforcement Learning (RL) | Code Available | 1 |
| Goal Misgeneralization in Deep Reinforcement Learning | May 28, 2021 | Deep Reinforcement Learning, Navigate | Code Available | 1 |
| Decentralized Social Navigation with Non-Cooperative Robots via Bi-Level Optimization | Jun 15, 2023 | Collision Avoidance, Multi-agent Reinforcement Learning | Code Available | 1 |
| Autonomous and cooperative design of the monitor positions for a team of UAVs to maximize the quantity and quality of detected objects | Jul 2, 2020 | Navigate | Code Available | 1 |
| One-Shot Informed Robotic Visual Search in the Wild | Mar 22, 2020 | Navigate, Representation Learning | Code Available | 1 |
| A Multiplicative Value Function for Safe and Efficient Reinforcement Learning | Mar 7, 2023 | Navigate, reinforcement-learning | Code Available | 1 |