SOTAVerified

Navigate

Papers

Showing 1-50 of 1982 papers

Title | Status | Hype
Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs | Code | 14
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering | Code | 11
Data Formulator 2: Iterative Creation of Data Visualizations, with AI Transforming Data Along the Way | Code | 11
UFO: A UI-Focused Agent for Windows OS Interaction | Code | 9
Mirage: A Multi-Level Superoptimizer for Tensor Programs | Code | 7
Training Compute-Optimal Large Language Models | Code | 6
Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution | Code | 6
IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems | Code | 5
ChatDBG: Augmenting Debugging with Large Language Models | Code | 5
AppAgent: Multimodal Agents as Smartphone Users | Code | 5
WebThinker: Empowering Large Reasoning Models with Deep Research Capability | Code | 5
RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark | Code | 4
Diffusion Models for Medical Image Analysis: A Comprehensive Survey | Code | 4
VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning | Code | 4
GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS | Code | 4
LocAgent: Graph-Guided LLM Agents for Code Localization | Code | 4
EvoX: A Distributed GPU-accelerated Framework for Scalable Evolutionary Computation | Code | 4
DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments | Code | 4
From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery | Code | 3
Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis | Code | 3
CarDreamer: Open-Source Learning Platform for World Model based Autonomous Driving | Code | 3
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents | Code | 3
A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models | Code | 3
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction | Code | 3
Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey | Code | 2
Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions | Code | 2
A vision-based autonomous UAV inspection framework for unknown tunnel construction sites with dynamic obstacles | Code | 2
Imagine Before Go: Self-Supervised Generative Map for Object Goal Navigation | Code | 2
Learning Efficient and Effective Trajectories for Differential Equation-based Image Restoration | Code | 2
Joint Perception and Prediction for Autonomous Driving: A Survey | Code | 2
GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation | Code | 2
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory | Code | 2
GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices | Code | 2
From Cognition to Precognition: A Future-Aware Framework for Social Navigation | Code | 2
ForesightNav: Learning Scene Imagination for Efficient Exploration | Code | 2
Generative Artificial Intelligence for Navigating Synthesizable Chemical Space | Code | 2
Holodeck: Language Guided Generation of 3D Embodied AI Environments | Code | 2
Controllable and Reliable Knowledge-Intensive Task-Oriented Conversational Agents with Declarative Genie Worksheets | Code | 2
Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection | Code | 2
AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO | Code | 2
Diffusion Models for Molecules: A Survey of Methods and Tasks | Code | 2
ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments | Code | 2
Demystifying AI Platform Design for Distributed Inference of Next-Generation LLM models | Code | 2
DeFoG: Discrete Flow Matching for Graph Generation | Code | 2
DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences | Code | 2
DualMap: Online Open-Vocabulary Semantic Mapping for Natural Language Navigation in Dynamic Changing Scenes | Code | 2
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future | Code | 2
FLAME: Learning to Navigate with Multimodal LLM in Urban Environments | Code | 2
Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models | Code | 2
AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench | Code | 2
Page 1 of 40

No leaderboard results yet.