SOTAVerified|Agents Browse Leaderboard About

GPU

Papers

Recently Added Most Hyped Most Active Needs Verification Most Verified

Showing 51–75 of 5629 papers

Title	Date	Tasks	Status	Hype	Score
YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications	Sep 7, 2022	GPUObject Detection	CodeCode Available	5	5
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning	Feb 29, 2024	GPULanguage Modeling	CodeCode Available	5	5
Fast On-device LLM Inference with NPUs	Jul 8, 2024	CPUGPU	CodeCode Available	5	5
ReLoRA: High-Rank Training Through Low-Rank Updates	Jul 11, 2023	GPU	CodeCode Available	5	5
Extreme Compression of Large Language Models via Additive Quantization	Jan 11, 2024	CPUGPU	CodeCode Available	5	5
Representing Long Volumetric Video with Temporal Gaussian Hierarchy	Dec 12, 2024	GPU	CodeCode Available	5	5
DEIM: DETR with Improved Matching for Fast Convergence	Dec 5, 2024	Data AugmentationGPU	CodeCode Available	5	5
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU	Mar 13, 2023	CPUGPU	CodeCode Available	5	5
AudioLCM: Text-to-Audio Generation with Latent Consistency Models	Jun 1, 2024	Audio GenerationAudio Synthesis	CodeCode Available	5	5
Point-E: A System for Generating 3D Point Clouds from Complex Prompts	Dec 16, 2022	Generating 3D Point CloudsGPU	CodeCode Available	5	5
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU	Dec 16, 2023	CPUGPU	CodeCode Available	5	5
Group-in-Group Policy Optimization for LLM Agent Training	May 16, 2025	GPUMathematical Reasoning	CodeCode Available	5	5
FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion	Jun 11, 2024	GPU	CodeCode Available	5	5
Deep Lake: a Lakehouse for Deep Learning	Sep 22, 2022	Decision MakingDeep Learning	CodeCode Available	5	5
Orbit: A Unified Simulation Framework for Interactive Robot Learning Environments	Jan 10, 2023	GPUImitation Learning	CodeCode Available	5	5
FlashAudio: Rectified Flows for Fast and High-Fidelity Text-to-Audio Generation	Oct 16, 2024	Audio GenerationGPU	CodeCode Available	5	5
TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second	Jul 5, 2022	AutoMLBayesian Inference	CodeCode Available	5	5
Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts	Feb 27, 2025	Computational EfficiencyGPU	CodeCode Available	5	5
EfficientRep:An Efficient Repvgg-style ConvNets with Hardware-aware Neural Network Design	Feb 1, 2023	GPUobject-detection	CodeCode Available	5	5
KBLaM: Knowledge Base augmented Language Model	Oct 14, 2024	8kGPU	CodeCode Available	5	5
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models	Aug 21, 2024	GPUQuantization	CodeCode Available	5	5
MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention	Apr 22, 2025	GPU	CodeCode Available	5	5
Deep Patch Visual SLAM	Aug 3, 2024	GPUVisual Odometry	CodeCode Available	4	5
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token	Jan 7, 2025	GPUVisual Question Answering (VQA)	CodeCode Available	4	5
Building reliable sim driving agents by scaling self-play	Feb 20, 2025	Autonomous VehiclesBenchmarking	CodeCode Available	4	5

Show:10 25 50

← PrevPage 3 of 226Next →

No leaderboard results yet.