SOTAVerified

GPU

Papers

Showing 201-250 of 5629 papers

| Title | Status | Hype |
| --- | --- | --- |
| ProDiff: Progressive Fast Diffusion Model For High-Quality Text-to-Speech | Code | 3 |
| Fine-Tuning Language Models with Just Forward Passes | Code | 3 |
| MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices | Code | 3 |
| Modular Duality in Deep Learning | Code | 3 |
| 94% on CIFAR-10 in 3.29 Seconds on a Single GPU | Code | 3 |
| MobileMamba: Lightweight Multi-Receptive Visual Mamba Network | Code | 3 |
| MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | Code | 3 |
| mlpack 3: a fast, flexible machine learning library | Code | 3 |
| BiLLM: Pushing the Limit of Post-Training Quantization for LLMs | Code | 3 |
| ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters | Code | 3 |
| EfficientQAT: Efficient Quantization-Aware Training for Large Language Models | Code | 3 |
| MoE-Infinity: Efficient MoE Inference on Personal Machines with Sparsity-Aware Expert Cache | Code | 3 |
| BitDelta: Your Fine-Tune May Only Be Worth One Bit | Code | 3 |
| Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI | Code | 3 |
| Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences | Code | 3 |
| Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models | Code | 3 |
| MetaDE: Evolving Differential Evolution by Differential Evolution | Code | 3 |
| FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design | Code | 3 |
| M+: Extending MemoryLLM with Scalable Long-Term Memory | Code | 3 |
| Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss | Code | 3 |
| MegaBlocks: Efficient Sparse Training with Mixture-of-Experts | Code | 3 |
| Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates | Code | 3 |
| Merlin: A Vision Language Foundation Model for 3D Computed Tomography | Code | 3 |
| Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence | Code | 3 |
| S-LoRA: Serving Thousands of Concurrent LoRA Adapters | Code | 3 |
| MagicPIG: LSH Sampling for Efficient LLM Generation | Code | 3 |
| APOLLO: SGD-like Memory, AdamW-level Performance | Code | 3 |
| MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | Code | 3 |
| Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy | Code | 3 |
| LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale | Code | 3 |
| LiteGS: A High-Performance Modular Framework for Gaussian Splatting Training | Code | 3 |
| LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture | Code | 3 |
| GPU-accelerated Evolutionary Multiobjective Optimization Using Tensorized RVEA | Code | 3 |
| GPU-accelerated Evolutionary Many-objective Optimization Using Tensorized NSGA-III | Code | 3 |
| Graph-Reward-SQL: Execution-Free Reinforcement Learning for Text-to-SQL via Graph Matching and Stepwise Reward | Code | 3 |
| Andes: Defining and Enhancing Quality-of-Experience in LLM-Based Text Streaming Services | Code | 3 |
| CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation | Code | 3 |
| Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces | Code | 3 |
| Data Generation for Hardware-Friendly Post-Training Quantization | Code | 3 |
| Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models | Code | 3 |
| The Mamba in the Llama: Distilling and Accelerating Hybrid Models | Code | 3 |
| KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization | Code | 3 |
| Cramming: Training a Language Model on a Single GPU in One Day | Code | 3 |
| Dataset Distillation with Neural Characteristic Function: A Minmax Perspective | Code | 3 |
| TorchCP: A Python Library for Conformal Prediction | Code | 3 |
| CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up | Code | 3 |
| AdaRevD: Adaptive Patch Exiting Reversible Decoder Pushes the Limit of Image Deblurring | Code | 3 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Code | 3 |
| How Well Do Supervised 3D Models Transfer to Medical Imaging Tasks? | Code | 3 |
| LinFusion: 1 GPU, 1 Minute, 16K Image | Code | 3 |
