SOTAVerified

Inference Optimization

Papers

Showing 11–20 of 56 papers

| Title | Status | Hype |
|---|---|---|
| Easy and Efficient Transformer : Scalable Inference Solution For large NLP model | Code | 1 |
| Sub-MoE: Efficient Mixture-of-Expert LLMs Compression via Subspace Expert Merging | Code | 0 |
| The Foundation Cracks: A Comprehensive Study on Bugs and Testing Practices in LLM Libraries | — | 0 |
| Brevity is the soul of sustainability: Characterizing LLM response lengths | Code | 0 |
| DSMentor: Enhancing Data Science Agents with Curriculum Learning and Online Knowledge Accumulation | — | 0 |
| Faster MoE LLM Inference for Extremely Large Models | — | 0 |
| Energy-Efficient Transformer Inference: Optimization Strategies for Time Series Classification | — | 0 |
| Hybrid Offline-online Scheduling Method for Large Language Model Inference Optimization | — | 0 |
| DVFS-Aware DNN Inference on GPUs: Latency Modeling and Performance Analysis | — | 0 |
| Hellinger-Kantorovich Gradient Flows: Global Exponential Decay of Entropy Functionals | — | 0 |
Page 2 of 6
