| Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs | Jun 17, 2024 | Language Modeling, Language Modelling | Code Available | 14 | 5 |
| SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering | May 6, 2024 | Bug fixing, Language Modeling | Code Available | 11 | 5 |
| Data Formulator 2: Iterative Creation of Data Visualizations, with AI Transforming Data Along the Way | Aug 28, 2024 | Code Generation, Navigate | Code Available | 11 | 5 |
| UFO: A UI-Focused Agent for Windows OS Interaction | Feb 8, 2024 | Navigate | Code Available | 9 | 5 |
| Mirage: A Multi-Level Superoptimizer for Tensor Programs | May 9, 2024 | GPU, Navigate | Code Available | 7 | 5 |
| Training Compute-Optimal Large Language Models | Mar 29, 2022 | Anachronisms, Analogical Similarity | Code Available | 6 | 5 |
| Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution | Jul 12, 2023 | Fairness, Image Classification | Code Available | 6 | 5 |
| IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems | Jan 19, 2025 | Navigate | Code Available | 5 | 5 |
| AppAgent: Multimodal Agents as Smartphone Users | Dec 21, 2023 | Navigate | Code Available | 5 | 5 |
| ChatDBG: Augmenting Debugging with Large Language Models | Mar 25, 2024 | C++ code, Navigate | Code Available | 5 | 5 |
| WebThinker: Empowering Large Reasoning Models with Deep Research Capability | Apr 30, 2025 | Navigate | Code Available | 5 | 5 |
| LocAgent: Graph-Guided LLM Agents for Code Localization | Mar 12, 2025 | GitHub issue resolution, Navigate | Code Available | 4 | 5 |
| EvoX: A Distributed GPU-accelerated Framework for Scalable Evolutionary Computation | Jan 29, 2023 | GPU, Navigate | Code Available | 4 | 5 |
| DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments | Apr 4, 2025 | Navigate, Prompt Engineering | Code Available | 4 | 5 |
| Diffusion Models for Medical Image Analysis: A Comprehensive Survey | Nov 14, 2022 | Denoising, Medical Image Analysis | Code Available | 4 | 5 |
| GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS | Aug 2, 2024 | GPU, Navigate | Code Available | 4 | 5 |
| VLN-R1: Vision-Language Navigation via Reinforcement Fine-Tuning | Jun 20, 2025 | Navigate, Vision-Language Navigation | Code Available | 4 | 5 |
| RL4CO: an Extensive Reinforcement Learning for Combinatorial Optimization Benchmark | Jun 29, 2023 | Combinatorial Optimization, Computational Efficiency | Code Available | 4 | 5 |
| From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery | May 19, 2025 | Navigate, scientific discovery | Code Available | 3 | 5 |
| A Practical Review of Mechanistic Interpretability for Transformer-Based Language Models | Jul 2, 2024 | Navigate | Code Available | 3 | 5 |
| Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis | Jun 5, 2024 | Mamba, Medical Image Analysis | Code Available | 3 | 5 |
| Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents | Oct 7, 2024 | Natural Language Visual Grounding, Navigate | Code Available | 3 | 5 |
| CarDreamer: Open-Source Learning Platform for World Model based Autonomous Driving | May 15, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available | 3 | 5 |
| Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction | Dec 5, 2024 | Multimodal Reasoning, Natural Language Visual Grounding | Code Available | 3 | 5 |
| Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey | Feb 27, 2024 | Language Modeling, Language Modelling | Code Available | 2 | 5 |
| AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO | Feb 20, 2025 | Autonomous Navigation, Navigate | Code Available | 2 | 5 |
| Imagine Before Go: Self-Supervised Generative Map for Object Goal Navigation | Jan 1, 2024 | General Knowledge, Navigate | Code Available | 2 | 5 |
| Learning Efficient and Effective Trajectories for Differential Equation-based Image Restoration | Oct 7, 2024 | Image Restoration, Navigate | Code Available | 2 | 5 |
| GUI Odyssey: A Comprehensive Dataset for Cross-App GUI Navigation on Mobile Devices | Jun 12, 2024 | Navigate | Code Available | 2 | 5 |
| Joint Perception and Prediction for Autonomous Driving: A Survey | Dec 18, 2024 | Autonomous Driving, motion prediction | Code Available | 2 | 5 |
| GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation | Apr 9, 2024 | Go to AnyThing, Navigate | Code Available | 2 | 5 |
| Holodeck: Language Guided Generation of 3D Embodied AI Environments | Dec 14, 2023 | Common Sense Reasoning, Language Modelling | Code Available | 2 | 5 |
| Generative Artificial Intelligence for Navigating Synthesizable Chemical Space | Oct 4, 2024 | Drug Discovery, Navigate | Code Available | 2 | 5 |
| From Cognition to Precognition: A Future-Aware Framework for Social Navigation | Sep 20, 2024 | Future prediction, Navigate | Code Available | 2 | 5 |
| Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory | May 25, 2023 | Common Sense Reasoning, CPU | Code Available | 2 | 5 |
| Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions | Jun 27, 2024 | Navigate, Vision and Language Navigation | Code Available | 2 | 5 |
| Controllable and Reliable Knowledge-Intensive Task-Oriented Conversational Agents with Declarative Genie Worksheets | Jul 8, 2024 | Hallucination, Navigate | Code Available | 2 | 5 |
| ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments | Apr 6, 2023 | Autonomous Navigation, Navigate | Code Available | 2 | 5 |
| DriVLMe: Enhancing LLM-based Autonomous Driving Agents with Embodied and Social Experiences | Jun 5, 2024 | Autonomous Driving, Autonomous Vehicles | Code Available | 2 | 5 |
| Event-based Stereo Depth Estimation: A Survey | Sep 26, 2024 | Depth Estimation, Navigate | Code Available | 2 | 5 |
| DeFoG: Discrete Flow Matching for Graph Generation | Oct 5, 2024 | Denoising, Graph Generation | Code Available | 2 | 5 |
| Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future | Sep 27, 2023 | Navigate | Code Available | 2 | 5 |
| Demystifying AI Platform Design for Distributed Inference of Next-Generation LLM models | Jun 3, 2024 | Chunking, Mamba | Code Available | 2 | 5 |
| Diffusion Models for Molecules: A Survey of Methods and Tasks | Feb 13, 2025 | Diversity, Drug Discovery | Code Available | 2 | 5 |
| DualMap: Online Open-Vocabulary Semantic Mapping for Natural Language Navigation in Dynamic Changing Scenes | Jun 2, 2025 | Natural Language Queries, Navigate | Code Available | 2 | 5 |
| Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection | Apr 6, 2025 | Cross-Domain Few-Shot, Cross-Domain Few-Shot Object Detection | Code Available | 2 | 5 |
| ForesightNav: Learning Scene Imagination for Efficient Exploration | Apr 22, 2025 | Efficient Exploration, Navigate | Code Available | 2 | 5 |
| DayDreamer: World Models for Physical Robot Learning | Jun 28, 2022 | Deep Reinforcement Learning, Navigate | Code Available | 2 | 5 |
| AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench | Jul 3, 2025 | Navigate | Code Available | 2 | 5 |
| AerialVLN: Vision-and-Language Navigation for UAVs | Aug 13, 2023 | cross-modal alignment, Navigate | Code Available | 2 | 5 |