SOTAVerified

Computational Efficiency

Methods and optimizations that reduce the computational resources (e.g., time, memory, or power) needed to train models and run inference. These include techniques that streamline processing, optimize algorithms, or exploit hardware to improve performance without compromising accuracy.

Papers

Showing 51-75 of 4891 papers

Title | Status | Hype
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training | Code | 3
Effects of charging and discharging capabilities on trade-offs between model accuracy and computational efficiency in pumped thermal electricity storage | Code | 3
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model | Code | 3
Residual Kolmogorov-Arnold Network for Enhanced Deep Learning | Code | 3
SOAP: Improving and Stabilizing Shampoo using Adam | Code | 3
Apollo: Band-sequence Modeling for High-Quality Audio Restoration | Code | 3
LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation | Code | 3
GSFusion: Online RGB-D Mapping Where Gaussian Splatting Meets TSDF Fusion | Code | 3
FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering | Code | 3
Human-like Episodic Memory for Infinite Context LLMs | Code | 3
Consistency Models Made Easy | Code | 3
VoCo-LLaMA: Towards Vision Compression with Large Language Models | Code | 3
HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model | Code | 3
vHeat: Building Vision Models upon Heat Conduction | Code | 3
A Foundation Model for the Earth System | Code | 3
BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global Maps | Code | 3
EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation | Code | 3
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing | Code | 3
Taming Diffusion Probabilistic Models for Character Control | Code | 3
TSLANet: Rethinking Transformers for Time Series Representation Learning | Code | 3
Tensorized NeuroEvolution of Augmenting Topologies for GPU Acceleration | Code | 3
Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement | Code | 3
STG-Mamba: Spatial-Temporal Graph Learning via Selective State Space Model | Code | 3
Is Mamba Effective for Time Series Forecasting? | Code | 3
TimeMachine: A Time Series is Worth 4 Mambas for Long-term Forecasting | Code | 3

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ViTaL | Hamming Loss | 0.05 | | Unverified