| LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs | Jan 10, 2025 | 4k, Visual Reasoning | Code Available | 3 |
| Knowledge Distillation with Adapted Weight | Jan 6, 2025 | 4k, Fairness | Unverified | 0 |
| PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution | Jan 1, 2025 | 4k, Super-Resolution | Unverified | 0 |
| "ScatSpotter" 2024 -- A Distributed Dog Poop Detection Dataset | Dec 21, 2024 | 4k | Unverified | 0 |
| Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation | Dec 18, 2024 | 3D Reconstruction, 4k | Code Available | 5 |
| Turbo-GS: Accelerating 3D Gaussian Fitting for High-Quality Radiance Fields | Dec 18, 2024 | 3DGS, 3D Reconstruction | Unverified | 0 |
| Real-time Free-view Human Rendering from Sparse-view RGB Videos using Double Unprojected Textures | Dec 17, 2024 | 4k | Unverified | 0 |
| Block-Based Multi-Scale Image Rescaling | Dec 16, 2024 | 2k, 4k | Unverified | 0 |
| PanSplat: 4K Panorama Synthesis with Feed-Forward Gaussian Splatting | Dec 16, 2024 | 3D Reconstruction, 4k | Code Available | 3 |
| Lexico: Extreme KV Cache Compression via Sparse Coding over Universal Dictionaries | Dec 12, 2024 | 4k, GSM8K | Code Available | 1 |
| Reversing the Damage: A QP-Aware Transformer-Diffusion Approach for 8K Video Restoration under Codec Compression | Dec 12, 2024 | 4k, 8k | Code Available | 1 |
| RTSR: A Real-Time Super-Resolution Model for AV1 Compressed Content | Nov 20, 2024 | 4k, Knowledge Distillation | Unverified | 0 |
| RadPhi-3: Small Language Models for Radiology | Nov 19, 2024 | 4k, Language Modeling | Unverified | 0 |
| Zoomed In, Diffused Out: Towards Local Degradation-Aware Multi-Diffusion for Extreme Image Super-Resolution | Nov 18, 2024 | 2k, 4k | Code Available | 0 |
| Additional Tests for TV 3.0 | Nov 18, 2024 | 4k | Unverified | 0 |
| TSFormer: A Robust Framework for Efficient UHD Image Restoration | Nov 17, 2024 | 4k, Computational Efficiency | Unverified | 0 |
| Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models | Nov 11, 2024 | 4k, Image Generation | Unverified | 0 |
| Advanced computer vision for extracting georeferenced vehicle trajectories from drone imagery | Nov 4, 2024 | 4k, geo-localization | Code Available | 1 |
| MPDS: A Movie Posters Dataset for Image Generation with Diffusion Model | Oct 22, 2024 | 4k, 8k | Unverified | 0 |
| Bias Similarity Across Large Language Models | Oct 15, 2024 | 4k, Fairness | Unverified | 0 |
| Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation | Oct 10, 2024 | 4k, Image Animation | Code Available | 7 |
| A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts | Oct 2, 2024 | 4k, GPU | Unverified | 0 |
| On The Adaptation of Unlimiformer for Decoder-Only Transformers | Oct 2, 2024 | 4k, 8k | Unverified | 0 |
| Study of Subjective and Objective Quality in Super-Resolution Enhanced Broadcast Images on a Novel SR-IQA Dataset | Sep 26, 2024 | 2k, 4k | Unverified | 0 |
| AIM 2024 Challenge on Efficient Video Super-Resolution for AV1 Compressed Content | Sep 25, 2024 | 4k, Super-Resolution | Unverified | 0 |
| AIM 2024 Challenge on UHD Blind Photo Quality Assessment | Sep 24, 2024 | 4k, Computational Efficiency | Code Available | 1 |
| On the Effectiveness of LLMs for Manual Test Verifications | Sep 19, 2024 | 4k, software testing | Unverified | 0 |
| Enhancing Decision-Making for LLM Agents via Step-Level Q-Value Models | Sep 14, 2024 | 4k, Decision Making | Unverified | 0 |
| USTC-TD: A Test Dataset and Benchmark for Image and Video Coding in 2020s | Sep 13, 2024 | 4k, MS-SSIM | Unverified | 0 |
| Hybrid Cost Volume for Memory-Efficient Optical Flow | Sep 6, 2024 | 4k, Optical Flow Estimation | Code Available | 1 |
| HiPrompt: Tuning-free Higher-Resolution Generation with Hierarchical MLLM Prompts | Sep 4, 2024 | 4k, Denoising | Code Available | 1 |
| CyberCortex.AI: An AI-based Operating System for Autonomous Robotics and Complex Automation | Sep 2, 2024 | 4k, Autonomous Driving | Unverified | 0 |
| Assessing UHD Image Quality from Aesthetics, Distortions, and Saliency | Sep 1, 2024 | 4k, Image Quality Assessment | Code Available | 1 |
| MemLong: Memory-Augmented Retrieval for Long Text Modeling | Aug 30, 2024 | 4k, Decoder | Code Available | 2 |
| Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models | Aug 28, 2024 | 2k, 4k | Code Available | 1 |
| Advanced atom-level representations for protein flexibility prediction utilizing graph neural networks | Aug 22, 2024 | 4k, Representation Learning | Unverified | 0 |
| Video-to-Text Pedestrian Monitoring (VTPM): Leveraging Computer Vision and Large Language Models for Privacy-Preserve Pedestrian Activity Monitoring at Intersections | Aug 21, 2024 | 4k, Pedestrian Detection | Unverified | 0 |
| MobileMEF: Fast and Efficient Method for Multi-Exposure Fusion | Aug 15, 2024 | 4k, Computational Efficiency | Code Available | 1 |
| What should I wear to a party in a Greek taverna? Evaluation for Conversational Agents in the Fashion Domain | Aug 13, 2024 | 4k | Unverified | 0 |
| A complete characterization of pairs of binary phylogenetic trees with identical A_k-alignments | Aug 13, 2024 | 2k, 4k | Unverified | 0 |
| Review Learning: Advancing All-in-One Ultra-High-Definition Image Restoration Training Method | Aug 13, 2024 | 4k, All | Unverified | 0 |
| PGNeXt: High-Resolution Salient Object Detection via Pyramid Grafting Network | Aug 2, 2024 | 4k, 8k | Unverified | 0 |
| Highly Efficient No-reference 4K Video Quality Assessment with Full-Pixel Covering Sampling and Training Strategy | Jul 30, 2024 | 4k, Video Quality Assessment | Unverified | 0 |
| ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities | Jul 19, 2024 | 4k, 8k | Unverified | 0 |
| Scaling Granite Code Models to 128K Context | Jul 18, 2024 | 2k, 4k | Code Available | 4 |
| NeedleBench: Can LLMs Do Retrieval and Reasoning in Information-Dense Context? | Jul 16, 2024 | 4k, 8k | Code Available | 9 |
| Uncovering Semantics and Topics Utilized by Threat Actors to Deliver Malicious Attachments and URLs | Jul 11, 2024 | 4k, Clustering | Unverified | 0 |
| HoloHisto: End-to-end Gigapixel WSI Segmentation with 4K Resolution Sequential Tokenization | Jul 3, 2024 | 4k, Image Segmentation | Unverified | 0 |
| Meta 3D TextureGen: Fast and Consistent Texture Generation for 3D Objects | Jul 2, 2024 | 4k, Texture Synthesis | Unverified | 0 |
| VFIMamba: Video Frame Interpolation with State Space Models | Jul 2, 2024 | 2k, 4k | Code Available | 2 |