SOTAVerified

GPU

Papers

Showing 51–75 of 5629 papers

Title | Status | Hype
YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications | Code | 5
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning | Code | 5
Fast On-device LLM Inference with NPUs | Code | 5
ReLoRA: High-Rank Training Through Low-Rank Updates | Code | 5
Extreme Compression of Large Language Models via Additive Quantization | Code | 5
Representing Long Volumetric Video with Temporal Gaussian Hierarchy | Code | 5
DEIM: DETR with Improved Matching for Fast Convergence | Code | 5
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU | Code | 5
AudioLCM: Text-to-Audio Generation with Latent Consistency Models | Code | 5
Point-E: A System for Generating 3D Point Clouds from Complex Prompts | Code | 5
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU | Code | 5
Group-in-Group Policy Optimization for LLM Agent Training | Code | 5
FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion | Code | 5
Deep Lake: a Lakehouse for Deep Learning | Code | 5
Orbit: A Unified Simulation Framework for Interactive Robot Learning Environments | Code | 5
FlashAudio: Rectified Flows for Fast and High-Fidelity Text-to-Audio Generation | Code | 5
TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second | Code | 5
Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts | Code | 5
EfficientRep:An Efficient Repvgg-style ConvNets with Hardware-aware Neural Network Design | Code | 5
KBLaM: Knowledge Base augmented Language Model | Code | 5
MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models | Code | 5
MMInference: Accelerating Pre-filling for Long-Context VLMs via Modality-Aware Permutation Sparse Attention | Code | 5
Deep Patch Visual SLAM | Code | 4
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token | Code | 4
Building reliable sim driving agents by scaling self-play | Code | 4

No leaderboard results yet.