SOTAVerified

Papers

Showing 51–100 of 1982 papers

Title | Status | Hype
A vision-based autonomous UAV inspection framework for unknown tunnel construction sites with dynamic obstacles | Code | 2
Demystifying AI Platform Design for Distributed Inference of Next-Generation LLM models | Code | 2
MTGS: Multi-Traversal Gaussian Splatting | Code | 2
VectorMapNet: End-to-end Vectorized HD Map Learning | Code | 2
Melting Pot 2.0 | Code | 2
MuGER^2: Multi-Granularity Evidence Retrieval and Reasoning for Hybrid Question Answering | Code | 2
WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents | Code | 2
LMVD: A Large-Scale Multimodal Vlog Dataset for Depression Detection in the Wild | Code | 2
Controllable and Reliable Knowledge-Intensive Task-Oriented Conversational Agents with Declarative Genie Worksheets | Code | 2
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent | Code | 2
NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning | Code | 2
Sim-to-Real Transfer via 3D Feature Fields for Vision-and-Language Navigation | Code | 2
Revisit Anything: Visual Place Recognition via Image Segment Retrieval | Code | 2
Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions | Code | 2
Holodeck: Language Guided Generation of 3D Embodied AI Environments | Code | 2
Joint Perception and Prediction for Autonomous Driving: A Survey | Code | 2
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory | Code | 2
AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO | Code | 2
GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation | Code | 2
Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey | Code | 2
FLAME: Learning to Navigate with Multimodal LLM in Urban Environments | Code | 2
AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench | Code | 2
ForesightNav: Learning Scene Imagination for Efficient Exploration | Code | 2
Generative Artificial Intelligence for Navigating Synthesizable Chemical Space | Code | 2
Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation | Code | 2
GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices | Code | 2
Event-based Stereo Depth Estimation: A Survey | Code | 2
Imagine Before Go: Self-Supervised Generative Map for Object Goal Navigation | Code | 2
ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments | Code | 2
DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences | Code | 2
DualMap: Online Open-Vocabulary Semantic Mapping for Natural Language Navigation in Dynamic Changing Scenes | Code | 2
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey | Code | 2
Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation | Code | 2
MAGE: A Multi-Agent Engine for Automated RTL Code Generation | Code | 2
AdaVLN: Towards Visual Language Navigation in Continuous Indoor Environments with Moving Humans | Code | 2
ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code | Code | 2
Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection | Code | 2
From Cognition to Precognition: A Future-Aware Framework for Social Navigation | Code | 2
AerialVLN: Vision-and-Language Navigation for UAVs | Code | 2
OpenFMNav: Towards Open-Set Zero-Shot Object Navigation via Vision-Language Foundation Models | Code | 2
Learning Efficient and Effective Trajectories for Differential Equation-based Image Restoration | Code | 2
Panoptic nuScenes: A Large-Scale Benchmark for LiDAR Panoptic Segmentation and Tracking | Code | 2
PLAYER*: Enhancing LLM-based Multi-Agent Communication and Interaction in Murder Mystery Games | Code | 2
Adaptive Risk-Tendency: Nano Drone Navigation in Cluttered Environments with Distributional Reinforcement Learning | Code | 1
Can GPT-4 Perform Neural Architecture Search? | Code | 1
Differentiable Agent-based Epidemiology | Code | 1
DISCO: Embodied Navigation and Interaction via Differentiable Scene Semantics and Dual-level Control | Code | 1
BioImage.IO Chatbot: A Community-Driven AI Assistant for Integrative Computational Bioimaging | Code | 1
Can Large Language Models be Good Path Planners? A Benchmark and Investigation on Spatial-temporal Reasoning | Code | 1
DFR-FastMOT: Detection Failure Resistant Tracker for Fast Multi-Object Tracking Based on Sensor Fusion | Code | 1
Page 2 of 40
