SOTAVerified

4k

Papers

Showing 1-50 of 367 papers

Title | Status | Hype
NeedleBench: Can LLMs Do Retrieval and Reasoning in Information-Dense Context? | Code | 9
InternLM2 Technical Report | Code | 9
World Model on Million-Length Video And Language With Blockwise RingAttention | Code | 9
ComfyUI-R1: Exploring Reasoning Models for Workflow Generation | Code | 7
Scaling Vision Pre-Training to 4K Resolution | Code | 7
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation | Code | 7
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models | Code | 6
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness | Code | 6
Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation | Code | 5
Make Your LLM Fully Utilize the Context | Code | 5
NTIRE 2024 Challenge on Low Light Image Enhancement: Methods and Results | Code | 5
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation | Code | 5
Scaling Granite Code Models to 128K Context | Code | 4
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling | Code | 4
Highly Accurate Dichotomous Image Segmentation | Code | 4
Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation | Code | 3
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models | Code | 3
Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy | Code | 3
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs | Code | 3
PanSplat: 4K Panorama Synthesis with Feed-Forward Gaussian Splatting | Code | 3
360Zhinao Technical Report | Code | 3
Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey | Code | 3
Data Engineering for Scaling Language Models to 128K Context | Code | 3
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training | Code | 3
Robust High-Resolution Video Matting with Temporal Guidance | Code | 3
Real-Time High-Resolution Background Matting | Code | 3
SeerAttention-R: Sparse Attention Adaptation for Long Reasoning | Code | 2
Learning Adaptive Parallel Reasoning with Language Models | Code | 2
Surg-3M: A Dataset and Foundation Model for Perception in Surgical Settings | Code | 2
MaSS13K: A Matting-level Semantic Segmentation Benchmark | Code | 2
Ultra-Resolution Adaptation with Ease | Code | 2
DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding | Code | 2
GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing | Code | 2
CaPa: Carve-n-Paint Synthesis for Efficient 4K Textured Mesh Generation | Code | 2
MemLong: Memory-Augmented Retrieval for Long Text Modeling | Code | 2
VFIMamba: Video Frame Interpolation with State Space Models | Code | 2
UVEB: A Large-scale Benchmark and Baseline Towards Real-World Underwater Video Enhancement | Code | 2
LongEmbed: Extending Embedding Models for Long Context Retrieval | Code | 2
LLoCO: Learning Long Contexts Offline | Code | 2
Counting-Stars: A Multi-evidence, Position-aware, and Scalable Benchmark for Evaluating Long-Context Large Language Models | Code | 2
Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture | Code | 2
Giraffe: Adventures in Expanding Context Lengths in LLMs | Code | 2
HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution | Code | 2
Bicubic++: Slim, Slimmer, Slimmest -- Designing an Industry-Grade Super-Resolution Network | Code | 2
Neural Preset for Color Style Transfer | Code | 2
Ultra-High-Definition Low-Light Image Enhancement: A Benchmark and Transformer-Based Method | Code | 2
4K-NeRF: High Fidelity Neural Radiance Fields at Ultra High Resolutions | Code | 2
Text2Light: Zero-Shot Text-Driven HDR Panorama Generation | Code | 2
VEViD: Vision Enhancement via Virtual diffraction and coherent Detection | Code | 2
BoW3D: Bag of Words for Real-Time Loop Closing in 3D LiDAR SLAM | Code | 2

No leaderboard results yet.