| Dynamic Parameter Memory: Temporary LoRA-Enhanced LLM for Long-Sequence Emotion Recognition in Conversation | Jul 11, 2025 | 4k, Emotion Recognition | Code Available | 0 |
| 4KAgent: Agentic Any Image to 4K Super-Resolution | Jul 9, 2025 | 4k, Image Quality Assessment | Unverified | 0 |
| AUTOMATIC ROOM LIGHT CONTROLLER MANAGEMENT SYSTEM. | Jun 25, 2025 | 4k, CPU | Unverified | 0 |
| UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions | Jun 16, 2025 | 4k, 8k | Unverified | 0 |
| ComfyUI-R1: Exploring Reasoning Models for Workflow Generation | Jun 11, 2025 | 4k | Code Available | 7 |
| TransXSSM: A Hybrid Transformer State Space Model with Unified Rotary Position Embedding | Jun 11, 2025 | 4k, Language Modeling | Unverified | 0 |
| SeerAttention-R: Sparse Attention Adaptation for Long Reasoning | Jun 10, 2025 | 4k, GPU | Code Available | 2 |
| Unfolding Spatial Cognition: Evaluating Multimodal Models on Visual Simulations | Jun 5, 2025 | 4k, Spatial Reasoning | Code Available | 1 |
| Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation | Jun 2, 2025 | 4k, Descriptive | Code Available | 3 |
| GThinker: Towards General Multimodal Reasoning via Cue-Guided Rethinking | Jun 1, 2025 | 4k, Math | Code Available | 0 |
| Latent Wavelet Diffusion: Enabling 4K Image Synthesis for Free | May 31, 2025 | 2k, 4k | Unverified | 0 |
| Control-R: Towards controllable test-time scaling | May 30, 2025 | 4k | Unverified | 0 |
| Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models | May 29, 2025 | 2k, 4k | Code Available | 1 |
| LoLA: Low-Rank Linear Attention With Sparse Caching | May 29, 2025 | 4k, 8k | Unverified | 0 |
| MonarchAttention: Zero-Shot Conversion to Fast, Hardware-Aware Structured Attention | May 24, 2025 | 16k, 4k | Code Available | 1 |
| QwenLong-CPRS: Towards ∞-LLMs with Dynamic Context Optimization | May 23, 2025 | 4k, Language Modeling | Unverified | 0 |
| VeriFastScore: Speeding up long-form factuality evaluation | May 22, 2025 | 4k, Form | Code Available | 0 |
| UNCLE: Uncertainty Expressions in Long-Form Generation | May 22, 2025 | 4k, Form | Unverified | 0 |
| Chain-of-Focus: Adaptive Visual Search and Zooming for Multimodal Reasoning via RL | May 21, 2025 | 4k, Multimodal Reasoning | Unverified | 0 |
| UHD Image Dehazing via anDehazeFormer with Atmospheric-aware KV Cache | May 20, 2025 | 4k, 8k | Unverified | 0 |
| Analog Foundation Models | May 14, 2025 | 4k, Quantization | Code Available | 1 |
| Leveraging Vision-Language Models for Visual Grounding and Analysis of Automotive UI | May 9, 2025 | 4k, Domain Generalization | Code Available | 0 |
| TeGA: Texture Space Gaussian Avatars for High-Resolution Dynamic Head Modeling | May 8, 2025 | 4k, Motion Estimation | Unverified | 0 |
| EntroLLM: Entropy Encoded Weight Compression for Efficient Large Language Model Inference on Edge Devices | May 5, 2025 | 4k, Language Modeling | Unverified | 0 |
| Learning Adaptive Parallel Reasoning with Language Models | Apr 21, 2025 | 4k | Code Available | 2 |
| Distribution-aware Dataset Distillation for Efficient Image Restoration | Apr 21, 2025 | 4k, Dataset Distillation | Unverified | 0 |
| Exploring Generalizable Pre-training for Real-world Change Detection via Geometric Estimation | Apr 19, 2025 | 4k, Change Detection | Unverified | 0 |
| WildLive: Near Real-time Visual Wildlife Tracking onboard UAVs | Apr 14, 2025 | 4k, Object Tracking | Unverified | 0 |
| Evaluation of the phi-3-mini SLM for identification of texts related to medicine, health, and sports injuries | Mar 31, 2025 | 4k, MedQA | Unverified | 0 |
| LENVIZ: A High-Resolution Low-Exposure Night Vision Benchmark Dataset | Mar 25, 2025 | 4k, Autonomous Driving | Unverified | 0 |
| Scaling Vision Pre-Training to 4K Resolution | Mar 25, 2025 | 4k, Contrastive Learning | Code Available | 7 |
| Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings | Mar 25, 2025 | 4k, Action Recognition | Code Available | 2 |
| Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models | Mar 24, 2025 | 4k, Image Generation | Code Available | 3 |
| MaSS13K: A Matting-level Semantic Segmentation Benchmark | Mar 24, 2025 | 4k, Image Matting | Code Available | 2 |
| KL3M Tokenizers: A Family of Domain-Specific and Character-Level Tokenizers for Legal, Financial, and Preprocessing Applications | Mar 21, 2025 | 16k, 4k | Code Available | 0 |
| iFlame: Interleaving Full and Linear Attention for Efficient Mesh Generation | Mar 20, 2025 | 4k, Computational Efficiency | Unverified | 0 |
| Ultra-Resolution Adaptation with Ease | Mar 20, 2025 | 2k, 4k | Code Available | 2 |
| GAEA: A Geolocation Aware Conversational Model | Mar 20, 2025 | 4k, model | Unverified | 0 |
| DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding | Mar 13, 2025 | 4k, Autonomous Driving | Code Available | 2 |
| Illuminating Darkness: Enhancing Real-world Low-light Scenes with Smartphone Images | Mar 10, 2025 | 4k, Benchmarking | Code Available | 1 |
| Evaluating the Suitability of Different Intraoral Scan Resolutions for Deep Learning-Based Tooth Segmentation | Feb 26, 2025 | 16k, 2k | Unverified | 0 |
| Heterogeneous Multi-Agent Bandits with Parsimonious Hints | Feb 22, 2025 | 4k, Multi-Armed Bandits | Unverified | 0 |
| ParallelComp: Parallel Long-Context Compressor for Length Extrapolation | Feb 20, 2025 | 4k, 8k | Unverified | 0 |
| CLOVER: A Test Case Generation Benchmark with Coverage, Long-Context, and Verification | Feb 12, 2025 | 16k, 4k | Unverified | 0 |
| Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy | Feb 7, 2025 | 4k, General Knowledge | Code Available | 3 |
| Claim Extraction for Fact-Checking: Data, Models, and Automated Metrics | Feb 7, 2025 | 4k, Fact Checking | Unverified | 0 |
| Directing Mamba to Complex Textures: An Efficient Texture-Aware State Space Model for Image Restoration | Jan 27, 2025 | 4k, 8k | Unverified | 0 |
| From Informal to Formal -- Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs | Jan 27, 2025 | 4k, Mathematical Reasoning | Unverified | 0 |
| GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing | Jan 23, 2025 | 4k | Code Available | 2 |
| CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation | Jan 16, 2025 | 3D Generation, 4k | Code Available | 2 |