SOTAVerified

Navigate

Papers

Showing 51-100 of 1982 papers

Title | Status | Hype
Sim-to-Real Transfer via 3D Feature Fields for Vision-and-Language Navigation | Code | 2
GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices | Code | 2
DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences | Code | 2
XRec: Large Language Models for Explainable Recommendation | Code | 2
Demystifying AI Platform Design for Distributed Inference of Next-Generation LLM models | Code | 2
Can Graph Learning Improve Planning in LLM-based Agents? | Code | 2
LMVD: A Large-Scale Multimodal Vlog Dataset for Depression Detection in the Wild | Code | 2
PLAYER*: Enhancing LLM-based Multi-Agent Communication and Interaction in Murder Mystery Games | Code | 2
The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models | Code | 2
Can Vehicle Motion Planning Generalize to Realistic Long-tail Scenarios? | Code | 2
GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation | Code | 2
Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation | Code | 2
Volumetric Environment Representation for Vision-Language Navigation | Code | 2
NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning | Code | 2
Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey | Code | 2
OpenFMNav: Towards Open-Set Zero-Shot Object Navigation via Vision-Language Foundation Models | Code | 2
Imagine Before Go: Self-Supervised Generative Map for Object Goal Navigation | Code | 2
Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models | Code | 2
Holodeck: Language Guided Generation of 3D Embodied AI Environments | Code | 2
VLFM: Vision-Language Frontier Maps for Zero-Shot Semantic Navigation | Code | 2
Towards Learning a Generalist Model for Embodied Navigation | Code | 2
Tree of Attacks: Jailbreaking Black-Box LLMs Automatically | Code | 2
Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey | Code | 2
ML-Bench: Evaluating Large Language Models and Agents for Machine Learning Tasks on Repository-Level Code | Code | 2
PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization | Code | 2
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future | Code | 2
VidChapters-7M: Video Chapters at Scale | Code | 2
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent | Code | 2
AerialVLN: Vision-and-Language Navigation for UAVs | Code | 2
WizMap: Scalable Interactive Visualization for Exploring Large Machine Learning Embeddings | Code | 2
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory | Code | 2
ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments | Code | 2
A vision-based autonomous UAV inspection framework for unknown tunnel construction sites with dynamic obstacles | Code | 2
Melting Pot 2.0 | Code | 2
MuGER^2: Multi-Granularity Evidence Retrieval and Reasoning for Hybrid Question Answering | Code | 2
Vision-aided UAV navigation and dynamic obstacle avoidance using gradient-based B-spline trajectory optimization | Code | 2
WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents | Code | 2
DayDreamer: World Models for Physical Robot Learning | Code | 2
VectorMapNet: End-to-end Vectorized HD Map Learning | Code | 2
Receding Moving Object Segmentation in 3D LiDAR Data Using Sparse 4D Convolutions | Code | 2
Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation | Code | 2
Scaling Language Models: Methods, Analysis & Insights from Training Gopher | Code | 2
Panoptic nuScenes: A Large-Scale Benchmark for LiDAR Panoptic Segmentation and Tracking | Code | 2
The Decrypto Benchmark for Multi-Agent Reasoning and Theory of Mind | Code | 1
SEMNAV: A Semantic Segmentation-Driven Approach to Visual Semantic Navigation | Code | 1
Cross from Left to Right Brain: Adaptive Text Dreamer for Vision-and-Language Navigation | Code | 1
Large Language Models for Planning: A Comprehensive and Systematic Survey | Code | 1
Neural Brain: A Neuroscience-inspired Framework for Embodied Agents | Code | 1
CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory | Code | 1
Future-Oriented Navigation: Dynamic Obstacle Avoidance with One-Shot Energy-Based Multimodal Motion Prediction | Code | 1

No leaderboard results yet.