| Long-context LLMs Struggle with Long In-context Learning | Apr 2, 2024 | 2k, In-Context Learning | Code Available | 5 |
| Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos | Jan 7, 2025 | 2k, Language Modeling | Code Available | 5 |
| Scaling Granite Code Models to 128K Context | Jul 18, 2024 | 2k, 4k | Code Available | 4 |
| CLAY: A Controllable Large-scale Generative Model for Creating High-quality 3D Assets | May 30, 2024 | 2k, 3D geometry | Code Available | 4 |
| SARDet-100K: Towards Open-Source Benchmark and ToolKit for Large-Scale SAR Object Detection | Mar 11, 2024 | 2D Object Detection, 2k | Code Available | 4 |
| MovieChat+: Question-aware Sparse Memory for Long Video Question Answering | Apr 26, 2024 | 2k, Question Answering | Code Available | 4 |
| Highly Accurate Dichotomous Image Segmentation | Mar 6, 2022 | 2k, 3D Reconstruction | Code Available | 4 |
| MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction | Feb 17, 2025 | 2k, Autonomous Driving | Code Available | 3 |
| Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis | Jun 10, 2024 | 2k, 3DGS | Code Available | 3 |
| CAMixerSR: Only Details Need More "Attention" | Feb 29, 2024 | 2k, 8k | Code Available | 3 |
| FlashDepth: Real-time Streaming Video Depth Estimation at 2K Resolution | Apr 9, 2025 | 2k, Decision Making | Code Available | 3 |
| 1.5-Pints Technical Report: Pretraining in Days, Not Months -- Your Language Model Thrives on Quality Data | Aug 7, 2024 | 16k, 2k | Code Available | 3 |
| AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension | Feb 12, 2024 | 2k, Automatic Speech Recognition | Code Available | 2 |
| MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization | Jul 10, 2025 | 2k, Quantization | Code Available | 2 |
| Linear Attention Sequence Parallelism | Apr 3, 2024 | 2k | Code Available | 2 |
| MGVQ: Could VQ-VAE Beat VAE? A Generalizable Tokenizer with Multi-group Quantization | Jul 14, 2025 | 2k, Image Generation | Code Available | 2 |
| Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models | Dec 21, 2023 | 2k | Code Available | 2 |
| High-fidelity 3D Human Digitization from Single 2K Resolution Images | Mar 27, 2023 | 2k, 3D Human Reconstruction | Code Available | 2 |
| 360MonoDepth: High-Resolution 360° Monocular Depth Estimation | Jan 1, 2022 | 2k, Depth Estimation | Code Available | 2 |
| HHAvatar: Gaussian Head Avatar with Dynamic Hairs | Dec 5, 2023 | 2k | Code Available | 2 |
| GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis | Dec 4, 2023 | 2k, Depth Estimation | Code Available | 2 |
| HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models | Dec 21, 2023 | 2k, Image Inpainting | Code Available | 2 |
| FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset | Mar 26, 2022 | 2k, 3D Face Reconstruction | Code Available | 2 |
| FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning | Mar 30, 2025 | 2k, GPU | Code Available | 2 |
| Any-resolution Training for High-resolution Image Synthesis | Apr 14, 2022 | 2k, Image Generation | Code Available | 2 |