| FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness | May 27, 2022 | 16k, 4k | Code Available | 6 | 5 |
| LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models | Sep 21, 2023 | 4k, GPU | Code Available | 6 | 5 |
| SqueezeLLM: Dense-and-Sparse Quantization | Jun 13, 2023 | GPU, Quantization | Code Available | 6 | 5 |
| FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning | Jul 17, 2023 | GPU, Language Modeling | Code Available | 6 | 5 |
| QLoRA: Efficient Finetuning of Quantized LLMs | May 23, 2023 | Chatbot, GPU | Code Available | 6 | 5 |
| AudioLCM: Text-to-Audio Generation with Latent Consistency Models | Jun 1, 2024 | Audio Generation, Audio Synthesis | Code Available | 5 | 5 |
| MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention | Apr 22, 2025 | GPU | Code Available | 5 | 5 |
| MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models | Aug 21, 2024 | GPU, Quantization | Code Available | 5 | 5 |
| DEIM: DETR with Improved Matching for Fast Convergence | Dec 5, 2024 | Data Augmentation, GPU | Code Available | 5 | 5 |
| Deep Lake: a Lakehouse for Deep Learning | Sep 22, 2022 | Decision Making, Deep Learning | Code Available | 5 | 5 |