SOTAVerified

Computational Efficiency

Methods and optimizations that reduce the computational resources (e.g., time, memory, or power) required for model training and inference. This category covers techniques that streamline processing, optimize algorithms, or leverage hardware to improve performance without compromising accuracy.
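As an illustrative sketch (not tied to any specific paper listed below), one common efficiency technique is reducing numeric precision: storing weights as 8-bit integers instead of 32-bit floats cuts memory by roughly 4x at the cost of a small, bounded rounding error. The helper names here are hypothetical, and this is a minimal symmetric per-tensor scheme, not a production quantizer.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: float32 weights -> int8 values plus one scale."""
    scale = np.max(np.abs(w)) / 127.0  # map the largest magnitude to +/-127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Approximate reconstruction of the original float32 weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)
q, scale = quantize_int8(w)

memory_ratio = w.nbytes / q.nbytes          # 4.0: int8 uses a quarter of the bytes
max_err = np.max(np.abs(w - dequantize(q, scale)))  # bounded by scale / 2
```

The rounding error of this scheme is at most half a quantization step (`scale / 2`), which is why low-bit storage can often be applied to weights or KV caches with little accuracy loss.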

Papers

Showing 151–160 of 4891 papers

Title | Status | Hype
SparseLLM: Towards Global Pruning for Pre-trained Language Models | Code | 2
Balancing LoRA Performance and Efficiency with Simple Shard Sharing | Code | 2
Agent Attention: On the Integration of Softmax and Linear Attention | Code | 2
GotenNet: Rethinking Efficient 3D Equivariant Graph Neural Networks | Code | 2
FuXi Weather: A data-to-forecast machine learning system for global weather | Code | 2
BitDecoding: Unlocking Tensor Cores for Long-Context LLMs Decoding with Low-Bit KV Cache | Code | 2
Free Video-LLM: Prompt-guided Visual Perception for Efficient Training-free Video LLMs | Code | 2
Harder Tasks Need More Experts: Dynamic Routing in MoE Models | Code | 2
Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution | Code | 2
Flow Matching in Latent Space | Code | 2
Page 16 of 490

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ViTaL | Hamming Loss | 0.05 | — | Unverified