GPU

Papers

Showing 51–75 of 5629 papers

Title	Date	Tasks	Status	Hype
XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models	Nov 22, 2024	GPU	Code Available	5
FlashAudio: Rectified Flows for Fast and High-Fidelity Text-to-Audio Generation	Oct 16, 2024	Audio Generation, GPU	Code Available	5
KBLaM: Knowledge Base augmented Language Model	Oct 14, 2024	8k, GPU	Code Available	5
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models	Aug 21, 2024	GPU, Quantization	Code Available	5
Fast On-device LLM Inference with NPUs	Jul 8, 2024	CPU, GPU	Code Available	5
FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion	Jun 11, 2024	GPU	Code Available	5
AudioLCM: Text-to-Audio Generation with Latent Consistency Models	Jun 1, 2024	Audio Generation, Audio Synthesis	Code Available	5
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning	Feb 29, 2024	GPU, Language Modeling	Code Available	5
Extreme Compression of Large Language Models via Additive Quantization	Jan 11, 2024	CPU, GPU	Code Available	5
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU	Dec 16, 2023	CPU, GPU	Code Available	5
LongQLoRA: Efficient and Effective Method to Extend Context Length of Large Language Models	Nov 8, 2023	8k, GPU	Code Available	5
ReLoRA: High-Rank Training Through Low-Rank Updates	Jul 11, 2023	GPU	Code Available	5
Faster Segment Anything: Towards Lightweight SAM for Mobile Applications	Jun 25, 2023	CPU, Decoder	Code Available	5
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU	Mar 13, 2023	CPU, GPU	Code Available	5
EfficientRep:An Efficient Repvgg-style ConvNets with Hardware-aware Neural Network Design	Feb 1, 2023	GPU, object-detection	Code Available	5
YOLOv6 v3.0: A Full-Scale Reloading	Jan 13, 2023	GPU, Object Detection	Code Available	5
Orbit: A Unified Simulation Framework for Interactive Robot Learning Environments	Jan 10, 2023	GPU, Imitation Learning	Code Available	5
Point-E: A System for Generating 3D Point Clouds from Complex Prompts	Dec 16, 2022	Generating 3D Point Clouds, GPU	Code Available	5
Deep Lake: a Lakehouse for Deep Learning	Sep 22, 2022	Decision Making, Deep Learning	Code Available	5
YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications	Sep 7, 2022	GPU, Object Detection	Code Available	5
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale	Aug 15, 2022	GPU, Language Modelling	Code Available	5
TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second	Jul 5, 2022	AutoML, Bayesian Inference	Code Available	5
Multi-head Temporal Latent Attention	May 19, 2025	GPU, speech-recognition	Code Available	4
Accelerating Visual-Policy Learning through Parallel Differentiable Simulation	May 15, 2025	GPU	Code Available	4
OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit	May 12, 2025	GPU, Privacy Preserving	Code Available	4

