SOTAVerified

GPU

Papers

Showing 176–200 of 5629 papers

| Title | Status | Hype |
| --- | --- | --- |
| EfficientQAT: Efficient Quantization-Aware Training for Large Language Models | Code | 3 |
| Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI | Code | 3 |
| MetaDE: Evolving Differential Evolution by Differential Evolution | Code | 3 |
| Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models | Code | 3 |
| 94% on CIFAR-10 in 3.29 Seconds on a Single GPU | Code | 3 |
| Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence | Code | 3 |
| MagicPIG: LSH Sampling for Efficient LLM Generation | Code | 3 |
| APOLLO: SGD-like Memory, AdamW-level Performance | Code | 3 |
| MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | Code | 3 |
| MegaBlocks: Efficient Sparse Training with Mixture-of-Experts | Code | 3 |
| mlpack 3: a fast, flexible machine learning library | Code | 3 |
| LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture | Code | 3 |
| MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices | Code | 3 |
| LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale | Code | 3 |
| LiteGS: A High-Performance Modular Framework for Gaussian Splatting Training | Code | 3 |
| Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy | Code | 3 |
| Merlin: A Vision Language Foundation Model for 3D Computed Tomography | Code | 3 |
| MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion | Code | 3 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Code | 3 |
| Andes: Defining and Enhancing Quality-of-Experience in LLM-Based Text Streaming Services | Code | 3 |
| KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization | Code | 3 |
| InstanSeg: an embedding-based instance segmentation algorithm optimized for accurate, efficient and portable cell segmentation | Code | 3 |
| nanoT5: A PyTorch Framework for Pre-training and Fine-tuning T5-style Models with Limited Resources | Code | 3 |
| CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation | Code | 3 |
| Data Generation for Hardware-Friendly Post-Training Quantization | Code | 3 |
