| Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs | Jun 17, 2024 | Language Modeling, Language Modelling | Code Available | 14 |
| Data Formulator 2: Iterative Creation of Data Visualizations, with AI Transforming Data Along the Way | Aug 28, 2024 | Code Generation, Navigate | Code Available | 11 |
| SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering | May 6, 2024 | Bug fixing, Language Modeling | Code Available | 11 |
| UFO: A UI-Focused Agent for Windows OS Interaction | Feb 8, 2024 | Navigate | Code Available | 9 |
| Mirage: A Multi-Level Superoptimizer for Tensor Programs | May 9, 2024 | GPU, Navigate | Code Available | 7 |
| Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution | Jul 12, 2023 | Fairness, Image Classification | Code Available | 6 |
| Training Compute-Optimal Large Language Models | Mar 29, 2022 | Anachronisms, Analogical Similarity | Code Available | 6 |
| WebThinker: Empowering Large Reasoning Models with Deep Research Capability | Apr 30, 2025 | Navigate | Code Available | 5 |
| IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems | Jan 19, 2025 | Navigate | Code Available | 5 |
| ChatDBG: Augmenting Debugging with Large Language Models | Mar 25, 2024 | C++ code, Navigate | Code Available | 5 |
| AppAgent: Multimodal Agents as Smartphone Users | Dec 21, 2023 | Navigate | Code Available | 5 |
| VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning | Jun 20, 2025 | Navigate, Vision-Language Navigation | Code Available | 4 |
| DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments | Apr 4, 2025 | Navigate, Prompt Engineering | Code Available | 4 |
| LocAgent: Graph-Guided LLM Agents for Code Localization | Mar 12, 2025 | GitHub issue resolution, Navigate | Code Available | 4 |
| GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS | Aug 2, 2024 | GPU, Navigate | Code Available | 4 |
| RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark | Jun 29, 2023 | Combinatorial Optimization, Computational Efficiency | Code Available | 4 |
| EvoX: A Distributed GPU-accelerated Framework for Scalable Evolutionary Computation | Jan 29, 2023 | GPU, Navigate | Code Available | 4 |
| Diffusion Models for Medical Image Analysis: A Comprehensive Survey | Nov 14, 2022 | Denoising, Medical Image Analysis | Code Available | 4 |
| From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery | May 19, 2025 | Navigate, scientific discovery | Code Available | 3 |
| Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction | Dec 5, 2024 | Multimodal Reasoning, Natural Language Visual Grounding | Code Available | 3 |
| Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents | Oct 7, 2024 | Natural Language Visual Grounding, Navigate | Code Available | 3 |
| A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models | Jul 2, 2024 | Navigate | Code Available | 3 |
| Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis | Jun 5, 2024 | Mamba, Medical Image Analysis | Code Available | 3 |
| CarDreamer: Open-Source Learning Platform for World Model based Autonomous Driving | May 15, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available | 3 |
| AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench | Jul 3, 2025 | Navigate | Code Available | 2 |
| DualMap: Online Open-Vocabulary Semantic Mapping for Natural Language Navigation in Dynamic Changing Scenes | Jun 2, 2025 | Natural Language Queries, Navigate | Code Available | 2 |
| Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation | May 16, 2025 | 3D geometry, Navigate | Code Available | 2 |
| ForesightNav: Learning Scene Imagination for Efficient Exploration | Apr 22, 2025 | Efficient Exploration, Navigate | Code Available | 2 |
| Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection | Apr 6, 2025 | Cross-Domain Few-Shot, Cross-Domain Few-Shot Object Detection | Code Available | 2 |
| MTGS: Multi-Traversal Gaussian Splatting | Mar 16, 2025 | Navigate, Novel View Synthesis | Code Available | 2 |
| Real-time Spatial-temporal Traversability Assessment via Feature-based Sparse Gaussian Process | Mar 6, 2025 | Autonomous Navigation, Computational Efficiency | Code Available | 2 |
| BEVDriver: Leveraging BEV Maps in LLMs for Robust Closed-Loop Driving | Mar 5, 2025 | Autonomous Driving, Motion Planning | Code Available | 2 |
| AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO | Feb 20, 2025 | Autonomous Navigation, Navigate | Code Available | 2 |
| NavRAG: Generating User Demand Instructions for Embodied Navigation through Retrieval-Augmented LLM | Feb 16, 2025 | Navigate, RAG | Code Available | 2 |
| Diffusion Models for Molecules: A Survey of Methods and Tasks | Feb 13, 2025 | Diversity, Drug Discovery | Code Available | 2 |
| Joint Perception and Prediction for Autonomous Driving: A Survey | Dec 18, 2024 | Autonomous Driving, motion prediction | Code Available | 2 |
| MAGE: A Multi-Agent Engine for Automated RTL Code Generation | Dec 10, 2024 | Code Generation, Navigate | Code Available | 2 |
| AdaVLN: Towards Visual Language Navigation in Continuous Indoor Environments with Moving Humans | Nov 27, 2024 | Navigate | Code Available | 2 |
| Real-Time Polygonal Semantic Mapping for Humanoid Robot Stair Climbing | Nov 4, 2024 | Computational Efficiency, GPU | Code Available | 2 |
| Learning Efficient and Effective Trajectories for Differential Equation-based Image Restoration | Oct 7, 2024 | Image Restoration, Navigate | Code Available | 2 |
| DeFoG: Discrete Flow Matching for Graph Generation | Oct 5, 2024 | Denoising, Graph Generation | Code Available | 2 |
| Generative Artificial Intelligence for Navigating Synthesizable Chemical Space | Oct 4, 2024 | Drug Discovery, Navigate | Code Available | 2 |
| Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models | Oct 2, 2024 | Mixture-of-Experts, Navigate | Code Available | 2 |
| Revisit Anything: Visual Place Recognition via Image Segment Retrieval | Sep 26, 2024 | Image Segmentation, Navigate | Code Available | 2 |
| Event-based Stereo Depth Estimation: A Survey | Sep 26, 2024 | Depth Estimation, Navigate | Code Available | 2 |
| From Cognition to Precognition: A Future-Aware Framework for Social Navigation | Sep 20, 2024 | Future prediction, Navigate | Code Available | 2 |
| FLAME: Learning to Navigate with Multimodal LLM in Urban Environments | Aug 20, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |
| Controllable and Reliable Knowledge-Intensive Task-Oriented Conversational Agents with Declarative Genie Worksheets | Jul 8, 2024 | Hallucination, Navigate | Code Available | 2 |
| Text2Robot: Evolutionary Robot Design from Text Descriptions | Jun 28, 2024 | Navigate, Text to 3D | Code Available | 2 |
| Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions | Jun 27, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 |