| Information Flow Routes: Automatically Interpreting Language Models at Scale | Feb 27, 2024 | | Code Available | 5 | 5 |
| Bridging Different Language Models and Generative Vision Models for Text-to-Image Generation | Mar 12, 2024 | Image Generation, Language Modelling | Code Available | 5 | 5 |
| UniDepth: Universal Monocular Metric Depth Estimation | Mar 27, 2024 | Depth Estimation, Monocular Depth Estimation | Code Available | 5 | 5 |
| Unleashing the Potential of SAM2 for Biomedical Images and Videos: A Survey | Aug 23, 2024 | Image Segmentation, Segmentation | Code Available | 5 | 5 |
| AssetOpsBench: Benchmarking AI Agents for Task Automation in Industrial Asset Operations and Maintenance | Jun 4, 2025 | Benchmarking, Scheduling | Code Available | 5 | 5 |
| DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ | May 24, 2024 | Language Modeling, Language Modelling | Code Available | 5 | 5 |
| Noisereduce: Domain General Noise Reduction for Time Series Signals | Dec 19, 2024 | Time Series | Code Available | 5 | 5 |
| Evaluating Real-World Robot Manipulation Policies in Simulation | May 9, 2024 | Robotic Grasping, Robot Manipulation | Code Available | 5 | 5 |
| LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model | Apr 28, 2023 | Instruction Following, model | Code Available | 5 | 5 |
| Orbit: A Unified Simulation Framework for Interactive Robot Learning Environments | Jan 10, 2023 | GPU, Imitation Learning | Code Available | 5 | 5 |
| ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models | May 30, 2025 | Reinforcement Learning (RL) | Code Available | 5 | 5 |
| WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct | Aug 18, 2023 | Arithmetic Reasoning, GSM8K | Code Available | 5 | 5 |
| Break the Sequential Dependency of LLM Inference Using Lookahead Decoding | Feb 3, 2024 | Code Completion | Code Available | 5 | 5 |
| Allegro: Open the Black Box of Commercial-Level Video Generation Model | Oct 20, 2024 | Video Generation | Code Available | 5 | 5 |
| Show-o: One Single Transformer to Unify Multimodal Understanding and Generation | Aug 22, 2024 | 10-shot image generation | Code Available | 5 | 5 |
| VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild | Nov 27, 2022 | Video Editing, Video Generation | Code Available | 5 | 5 |
| XFeat: Accelerated Features for Lightweight Image Matching | Apr 30, 2024 | CPU, Keypoint detection and image matching | Code Available | 5 | 5 |
| Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent | Nov 4, 2024 | Logical Reasoning, Mathematical Problem-Solving | Code Available | 5 | 5 |
| ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment | Mar 8, 2024 | Denoising, Image Generation | Code Available | 5 | 5 |
| ShareGPT4Video: Improving Video Understanding and Generation with Better Captions | Jun 6, 2024 | Video Captioning, Video Generation | Code Available | 5 | 5 |
| Video Depth Anything: Consistent Depth Estimation for Super-Long Videos | Jan 21, 2025 | Computational Efficiency, Depth Estimation | Code Available | 5 | 5 |
| Fast Inference from Transformers via Speculative Decoding | Nov 30, 2022 | Language Modeling, Language Modelling | Code Available | 5 | 5 |
| TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty | Nov 1, 2022 | | Code Available | 5 | 5 |
| Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine | Nov 28, 2023 | Electrical Engineering, Experimental Design | Code Available | 5 | 5 |
| NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms | Feb 25, 2025 | Language Modeling, Language Modelling | Code Available | 5 | 5 |