| Efficient multi-prompt evaluation of LLMs | May 27, 2024 | MMLU | Code Available | 7 |
| TTRL: Test-Time Reinforcement Learning | Apr 22, 2025 | Math, reinforcement-learning | Code Available | 7 |
| Elixir: Train a Large Language Model on a Small GPU Cluster | Dec 10, 2022 | CPU, GPU | Code Available | 7 |
| Gradient-Based Multi-Objective Deep Learning: Algorithms, Theories, Applications, and Beyond | Jan 19, 2025 | Deep Learning, Multi-Task Learning | Code Available | 7 |
| PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding | Apr 17, 2025 | Video Question Answering, Video Understanding | Code Available | 7 |
| Tulu 3: Pushing Frontiers in Open Language Model Post-Training | Nov 22, 2024 | Language Modeling, Language Modelling | Code Available | 7 |
| Measuring Massive Multitask Chinese Understanding | Apr 25, 2023 | All | Code Available | 7 |
| YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors | Jul 6, 2022 | 2D Object Detection, GPU | Code Available | 7 |
| FoundationStereo: Zero-Shot Stereo Matching | Jan 17, 2025 | Depth Estimation, Diversity | Code Available | 7 |
| Mirage: A Multi-Level Superoptimizer for Tensor Programs | May 9, 2024 | GPU, Navigate | Code Available | 7 |
| TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables | Feb 29, 2024 | Time Series, Time Series Forecasting | Code Available | 7 |
| Visual Agentic Reinforcement Fine-Tuning | May 20, 2025 | Image Manipulation | Code Available | 7 |
| Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection | May 16, 2024 | Edge-computing, Few-Shot Object Detection | Code Available | 7 |
| Align Anything: Training All-Modality Models to Follow Instructions with Language Feedback | Dec 20, 2024 | All, Instruction Following | Code Available | 7 |
| LLaVA-NeXT-Interleave: Tackling Multi-image, Video, and 3D in Large Multimodal Models | Jul 10, 2024 | Video Question Answering, Zero-Shot Video Question Answer | Code Available | 7 |
| Measuring short-form factuality in large language models | Nov 7, 2024 | Form | Code Available | 7 |
| RedPajama: an Open Dataset for Training Large Language Models | Nov 19, 2024 | | Code Available | 7 |
| Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library | Jun 6, 2025 | Management | Code Available | 7 |
| BrowseComp: A Simple Yet Challenging Benchmark for Browsing Agents | Apr 16, 2025 | | Code Available | 7 |
| Easy Begun is Half Done: Spatial-Temporal Graph Modeling with ST-Curriculum Dropout | Nov 28, 2022 | | Code Available | 7 |
| Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning | Apr 24, 2025 | Code Generation | Code Available | 7 |
| On the Vulnerability of LLM/VLM-Controlled Robotics | Feb 15, 2024 | Language Modelling, Robot Manipulation | Code Available | 7 |
| OpenAssistant Conversations - Democratizing Large Language Model Alignment | Sep 26, 2023 | | Code Available | 7 |
| Grounding Image Matching in 3D with MASt3R | Jun 14, 2024 | 3D Reconstruction | Code Available | 7 |
| PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides | Jan 7, 2025 | | Code Available | 7 |
| VACE: All-in-One Video Creation and Editing | Mar 10, 2025 | All, Human-Domain Subject-to-Video | Code Available | 7 |
| Revisiting PCA for time series reduction in temporal dimension | Dec 27, 2024 | Computational Efficiency, Dimensionality Reduction | Code Available | 7 |
| Pyramidal Flow Matching for Efficient Video Generative Modeling | Oct 8, 2024 | GPU, Text-to-Video Generation | Code Available | 7 |
| Marigold: Affordable Adaptation of Diffusion-Based Image Generators for Image Analysis | May 14, 2025 | Denoising, Depth Estimation | Code Available | 7 |
| TextGrad: Automatic "Differentiation" via Text | Jun 11, 2024 | Question Answering, Specificity | Code Available | 7 |
| RAGEN: Understanding Self-Evolution in LLM Agents via Multi-Turn Reinforcement Learning | Apr 24, 2025 | Decision Making, Reinforcement Learning (RL) | Code Available | 7 |
| Flow-GRPO: Training Flow Matching Models via Online RL | May 8, 2025 | Denoising, Diversity | Code Available | 7 |
| AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning | May 30, 2025 | GPU, Math | Code Available | 7 |
| Pre^3: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation | Jun 4, 2025 | | Code Available | 7 |
| DeepSeek-VL: Towards Real-World Vision-Language Understanding | Mar 8, 2024 | Chatbot, Language Modelling | Code Available | 7 |
| Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability | May 27, 2024 | Autonomous Driving, Video Generation | Code Available | 7 |
| Grants4Companies: Applying Declarative Methods for Recommending and Reasoning About Business Grants in the Austrian Public Administration (System Description) | Jun 21, 2024 | | Code Available | 7 |
| InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models | Apr 10, 2024 | Image to 3D | Code Available | 7 |
| PEER: Expertizing Domain-Specific Tasks with a Multi-Agent Framework and Tuning Methods | Jul 9, 2024 | Information Retrieval, LEMMA | Code Available | 7 |
| Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering | Jan 16, 2024 | Code Generation, Prompt Engineering | Code Available | 7 |
| Dynamic Evaluation of Large Language Models by Meta Probing Agents | Feb 21, 2024 | Data Augmentation | Code Available | 7 |
| Better Synthetic Data by Retrieving and Transforming Existing Datasets | Apr 22, 2024 | Dataset Generation, Diversity | Code Available | 7 |
| Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation | Mar 22, 2024 | Depth Estimation, Surface Normal Estimation | Code Available | 7 |
| From RAG to Memory: Non-Parametric Continual Learning for Large Language Models | Feb 20, 2025 | Continual Learning, Knowledge Graphs | Code Available | 7 |
| AIOS Compiler: LLM as Interpreter for Natural Language Programming and Flow Programming of AI Agents | May 11, 2024 | | Code Available | 7 |
| Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP | Dec 28, 2022 | In-Context Learning, Language Modelling | Code Available | 7 |
| ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness? | Jul 19, 2024 | Benchmarking, Code Generation | Code Available | 7 |
| mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models | Aug 9, 2024 | Language Modeling, Language Modelling | Code Available | 7 |
| MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery | Sep 9, 2024 | Memorization, Question Answering | Code Available | 7 |
| PyRIT: A Framework for Security Risk Identification and Red Teaming in Generative AI System | Oct 1, 2024 | Red Teaming | Code Available | 7 |