SOTAVerified

GPU

Papers

Showing 226-250 of 5629 papers

Title | Status | Hype
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs | Code | 3
Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces | Code | 3
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization | Code | 3
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design | Code | 3
MoE-Infinity: Efficient MoE Inference on Personal Machines with Sparsity-Aware Expert Cache | Code | 3
Inferflow: an Efficient and Highly Configurable Inference Engine for Large Language Models | Code | 3
RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation | Code | 3
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models | Code | 3
MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices | Code | 3
XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library | Code | 3
Splatter Image: Ultra-Fast Single-View 3D Reconstruction | Code | 3
S-LoRA: Serving Thousands of Concurrent LoRA Adapters | Code | 3
Punica: Multi-Tenant LoRA Serving | Code | 3
TorchSparse++: Efficient Training and Inference Framework for Sparse Convolution on GPUs | Code | 3
Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews | Code | 3
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation | Code | 3
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation | Code | 3
nanoT5: A PyTorch Framework for Pre-training and Fine-tuning T5-style Models with Limited Resources | Code | 3
Retentive Network: A Successor to Transformer for Large Language Models | Code | 3
TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement | Code | 3
Fine-Tuning Language Models with Just Forward Passes | Code | 3
Unlimiformer: Long-Range Transformers with Unlimited Length Input | Code | 3
TorchBench: Benchmarking PyTorch with High API Surface Coverage | Code | 3
FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization | Code | 3
EvoTorch: Scalable Evolutionary Computation in Python | Code | 3

No leaderboard results yet.