SOTAVerified

Computational Efficiency

Methods and optimizations that reduce the computational resources (e.g., time, memory, or power) needed for model training and inference. These techniques streamline processing, optimize algorithms, or exploit hardware capabilities to improve performance without compromising accuracy.

Papers

Showing 51-100 of 4891 papers

Title | Status | Hype
When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training | Code | 3
Effects of charging and discharging capabilities on trade-offs between model accuracy and computational efficiency in pumped thermal electricity storage | Code | 3
FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model | Code | 3
Residual Kolmogorov-Arnold Network for Enhanced Deep Learning | Code | 3
SOAP: Improving and Stabilizing Shampoo using Adam | Code | 3
Apollo: Band-sequence Modeling for High-Quality Audio Restoration | Code | 3
LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation | Code | 3
GSFusion: Online RGB-D Mapping Where Gaussian Splatting Meets TSDF Fusion | Code | 3
FlashGS: Efficient 3D Gaussian Splatting for Large-scale and High-resolution Rendering | Code | 3
Human-like Episodic Memory for Infinite Context LLMs | Code | 3
Consistency Models Made Easy | Code | 3
VoCo-LLaMA: Towards Vision Compression with Large Language Models | Code | 3
HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model | Code | 3
vHeat: Building Vision Models upon Heat Conduction | Code | 3
A Foundation Model for the Earth System | Code | 3
BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global Maps | Code | 3
EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation | Code | 3
RAG and RAU: A Survey on Retrieval-Augmented Language Model in Natural Language Processing | Code | 3
Taming Diffusion Probabilistic Models for Character Control | Code | 3
TSLANet: Rethinking Transformers for Time Series Representation Learning | Code | 3
Tensorized NeuroEvolution of Augmenting Topologies for GPU Acceleration | Code | 3
Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement | Code | 3
STG-Mamba: Spatial-Temporal Graph Learning via Selective State Space Model | Code | 3
Is Mamba Effective for Time Series Forecasting? | Code | 3
TimeMachine: A Time Series is Worth 4 Mambas for Long-term Forecasting | Code | 3
DUFOMap: Efficient Dynamic Awareness Mapping | Code | 3
MAPE-PPI: Towards Effective and Efficient Protein-Protein Interaction Prediction via Microenvironment-Aware Protein Embedding | Code | 3
FiT: Flexible Vision Transformer for Diffusion Model | Code | 3
Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces | Code | 3
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones | Code | 3
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving | Code | 3
I^2-World: Intra-Inter Tokenization for Efficient Dynamic 4D Scene Forecasting | Code | 2
VolumetricSMPL: A Neural Volumetric Body Model for Efficient Interactions, Contacts, and Collisions | Code | 2
StreamSplat: Towards Online Dynamic 3D Reconstruction from Uncalibrated Video Streams | Code | 2
Geometry Aware Operator Transformer as an Efficient and Accurate Neural Surrogate for PDEs on Arbitrary Domains | Code | 2
MonoSplat: Generalizable 3D Gaussian Splatting from Monocular Depth Foundation Models | Code | 2
Hybrid 3D-4D Gaussian Splatting for Fast Dynamic Scene Representation | Code | 2
Miipher-2: A Universal Speech Restoration Model for Million-Hour Scale Data Restoration | Code | 2
Towards Practical Second-Order Optimizers in Deep Learning: Insights from Fisher Information Analysis | Code | 2
GotenNet: Rethinking Efficient 3D Equivariant Graph Neural Networks | Code | 2
Retrieval Augmented Generation Evaluation in the Era of Large Language Models: A Comprehensive Survey | Code | 2
CLIP-Powered Domain Generalization and Domain Adaptation: A Comprehensive Survey | Code | 2
Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models | Code | 2
InteractRank: Personalized Web-Scale Search Pre-Ranking with Cross Interaction Features | Code | 2
RWKVTTS: Yet another TTS based on RWKV-7 | Code | 2
Scaling Video-Language Models to 10K Frames via Hierarchical Differential Distillation | Code | 2
Re-thinking Temporal Search for Long-Form Video Understanding | Code | 2
LandMarkSystem Technical Report | Code | 2
SuperFlow++: Enhanced Spatiotemporal Consistency for Cross-Modal Data Pretraining | Code | 2
BitDecoding: Unlocking Tensor Cores for Long-Context LLMs Decoding with Low-Bit KV Cache | Code | 2

Benchmark Results

# | Model | Metric | Claimed | Verified | Status
1 | ViTaL | Hamming Loss | 0.05 | - | Unverified