| A vision-based autonomous UAV inspection framework for unknown tunnel construction sites with dynamic obstacles | Jan 20, 2023 | Navigate | Code Available | 2 | 5 |
| Demystifying AI Platform Design for Distributed Inference of Next-Generation LLM models | Jun 3, 2024 | Chunking, Mamba | Code Available | 2 | 5 |
| MTGS: Multi-Traversal Gaussian Splatting | Mar 16, 2025 | Navigate, Novel View Synthesis | Code Available | 2 | 5 |
| VectorMapNet: End-to-end Vectorized HD Map Learning | Jun 17, 2022 | 3D Lane Detection, Autonomous Driving | Code Available | 2 | 5 |
| Melting Pot 2.0 | Nov 24, 2022 | Artificial Life, Navigate | Code Available | 2 | 5 |
| MuGER^2: Multi-Granularity Evidence Retrieval and Reasoning for Hybrid Question Answering | Oct 19, 2022 | Navigate, Question Answering | Code Available | 2 | 5 |
| WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents | Jul 4, 2022 | Decision Making, Imitation Learning | Code Available | 2 | 5 |
| LMVD: A Large-Scale Multimodal Vlog Dataset for Depression Detection in the Wild | May 9, 2024 | Depression Detection, Navigate | Code Available | 2 | 5 |
| Controllable and Reliable Knowledge-Intensive Task-Oriented Conversational Agents with Declarative Genie Worksheets | Jul 8, 2024 | Hallucination, Navigate | Code Available | 2 | 5 |
| LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent | Sep 21, 2023 | 3D visual grounding, Language Modeling | Code Available | 2 | 5 |
| NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning | Mar 12, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 | 5 |
| Sim-to-Real Transfer via 3D Feature Fields for Vision-and-Language Navigation | Jun 14, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 | 5 |
| Revisit Anything: Visual Place Recognition via Image Segment Retrieval | Sep 26, 2024 | Image Segmentation, Navigate | Code Available | 2 | 5 |
| Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions | Jun 27, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 | 5 |
| Holodeck: Language Guided Generation of 3D Embodied AI Environments | Dec 14, 2023 | Common Sense Reasoning, Language Modelling | Code Available | 2 | 5 |
| Joint Perception and Prediction for Autonomous Driving: A Survey | Dec 18, 2024 | Autonomous Driving, motion prediction | Code Available | 2 | 5 |
| Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory | May 25, 2023 | Common Sense Reasoning, CPU | Code Available | 2 | 5 |
| AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO | Feb 20, 2025 | Autonomous Navigation, Navigate | Code Available | 2 | 5 |
| GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation | Apr 9, 2024 | Go to AnyThing, Navigate | Code Available | 2 | 5 |
| Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey | Feb 27, 2024 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| FLAME: Learning to Navigate with Multimodal LLM in Urban Environments | Aug 20, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 | 5 |
| AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench | Jul 3, 2025 | Navigate | Code Available | 2 | 5 |
| ForesightNav: Learning Scene Imagination for Efficient Exploration | Apr 22, 2025 | Efficient Exploration, Navigate | Code Available | 2 | 5 |
| Generative Artificial Intelligence for Navigating Synthesizable Chemical Space | Oct 4, 2024 | Drug Discovery, Navigate | Code Available | 2 | 5 |
| Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation | May 16, 2025 | 3D geometry, Navigate | Code Available | 2 | 5 |
| GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices | Jun 12, 2024 | Navigate | Code Available | 2 | 5 |
| Event-based Stereo Depth Estimation: A Survey | Sep 26, 2024 | Depth Estimation, Navigate | Code Available | 2 | 5 |
| Imagine Before Go: Self-Supervised Generative Map for Object Goal Navigation | Jan 1, 2024 | General Knowledge, Navigate | Code Available | 2 | 5 |
| ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments | Apr 6, 2023 | Autonomous Navigation, Navigate | Code Available | 2 | 5 |
| DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences | Jun 5, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available | 2 | 5 |
| DualMap: Online Open-Vocabulary Semantic Mapping for Natural Language Navigation in Dynamic Changing Scenes | Jun 2, 2025 | Natural Language Queries, Navigate | Code Available | 2 | 5 |
| Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey | Nov 21, 2023 | Navigate | Code Available | 2 | 5 |
| Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation | Apr 2, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 | 5 |
| MAGE: A Multi-Agent Engine for Automated RTL Code Generation | Dec 10, 2024 | Code Generation, Navigate | Code Available | 2 | 5 |
| AdaVLN: Towards Visual Language Navigation in Continuous Indoor Environments with Moving Humans | Nov 27, 2024 | Navigate | Code Available | 2 | 5 |
| ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code | Nov 16, 2023 | Code Generation, Navigate | Code Available | 2 | 5 |
| Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection | Apr 6, 2025 | Cross-Domain Few-Shot, Cross-Domain Few-Shot Object Detection | Code Available | 2 | 5 |
| From Cognition to Precognition: A Future-Aware Framework for Social Navigation | Sep 20, 2024 | Future prediction, Navigate | Code Available | 2 | 5 |
| AerialVLN: Vision-and-Language Navigation for UAVs | Aug 13, 2023 | cross-modal alignment, Navigate | Code Available | 2 | 5 |
| OpenFMNav: Towards Open-Set Zero-Shot Object Navigation via Vision-Language Foundation Models | Feb 16, 2024 | Common Sense Reasoning, Navigate | Code Available | 2 | 5 |
| Learning Efficient and Effective Trajectories for Differential Equation-based Image Restoration | Oct 7, 2024 | Image Restoration, Navigate | Code Available | 2 | 5 |
| Panoptic nuScenes: A Large-Scale Benchmark for LiDAR Panoptic Segmentation and Tracking | Sep 8, 2021 | Benchmarking, Diversity | Code Available | 2 | 5 |
| PLAYER*: Enhancing LLM-based Multi-Agent Communication and Interaction in Murder Mystery Games | Apr 26, 2024 | Decision Making, Language Modeling | Code Available | 2 | 5 |
| Adaptive Risk-Tendency: Nano Drone Navigation in Cluttered Environments with Distributional Reinforcement Learning | Mar 28, 2022 | Distributional Reinforcement Learning, Drone navigation | Code Available | 1 | 5 |
| Can GPT-4 Perform Neural Architecture Search? | Apr 21, 2023 | Navigate, Neural Architecture Search | Code Available | 1 | 5 |
| Differentiable Agent-based Epidemiology | Jul 20, 2022 | Epidemiology, Navigate | Code Available | 1 | 5 |
| DISCO: Embodied Navigation and Interaction via Differentiable Scene Semantics and Dual-level Control | Jul 20, 2024 | Instruction Following, Navigate | Code Available | 1 | 5 |
| BioImage.IO Chatbot: A Community-Driven AI Assistant for Integrative Computational Bioimaging | Oct 23, 2023 | Chatbot, Information Retrieval | Code Available | 1 | 5 |
| Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning | Oct 5, 2023 | Navigate, Spatial Reasoning | Code Available | 1 | 5 |
| DFR-FastMOT: Detection Failure Resistant Tracker for Fast Multi-Object Tracking Based on Sensor Fusion | Feb 28, 2023 | Autonomous Vehicles, Multi-Object Tracking | Code Available | 1 | 5 |