| Scaling Vision Pre-Training to 4K Resolution | Mar 25, 2025 | 4k, Contrastive Learning | Code Available | 7 |
| Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings | Mar 25, 2025 | 4k, Action Recognition | Code Available | 2 |
| Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models | Mar 24, 2025 | 4k, Image Generation | Code Available | 3 |
| MaSS13K: A Matting-level Semantic Segmentation Benchmark | Mar 24, 2025 | 4k, Image Matting | Code Available | 2 |
| KL3M Tokenizers: A Family of Domain-Specific and Character-Level Tokenizers for Legal, Financial, and Preprocessing Applications | Mar 21, 2025 | 16k, 4k | Code Available | 0 |
| iFlame: Interleaving Full and Linear Attention for Efficient Mesh Generation | Mar 20, 2025 | 4k, Computational Efficiency | Unverified | 0 |
| Ultra-Resolution Adaptation with Ease | Mar 20, 2025 | 2k, 4k | Code Available | 2 |
| GAEA: A Geolocation Aware Conversational Model | Mar 20, 2025 | 4k, model | Unverified | 0 |
| DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding | Mar 13, 2025 | 4k, Autonomous Driving | Code Available | 2 |
| Illuminating Darkness: Enhancing Real-world Low-light Scenes with Smartphone Images | Mar 10, 2025 | 4k, Benchmarking | Code Available | 1 |