SOTAVerified

GPU Papers

Showing 201–225 of 5629 papers

Title | Status | Hype
Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence | Code | 3
Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy | Code | 3
MagicPIG: LSH Sampling for Efficient LLM Generation | Code | 3
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts | Code | 3
mlpack 3: a fast, flexible machine learning library | Code | 3
LiteGS: A High-Performance Modular Framework for Gaussian Splatting Training | Code | 3
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale | Code | 3
LinFusion: 1 GPU, 1 Minute, 16K Image | Code | 3
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture | Code | 3
LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Code | 3
Andes: Defining and Enhancing Quality-of-Experience in LLM-Based Text Streaming Services | Code | 3
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs | Code | 3
Data Generation for Hardware-Friendly Post-Training Quantization | Code | 3
Dataset Distillation with Neural Characteristic Function: A Minmax Perspective | Code | 3
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models | Code | 3
InstanSeg: an embedding-based instance segmentation algorithm optimized for accurate, efficient and portable cell segmentation | Code | 3
Allo: A Programming Model for Composable Accelerator Design | Code | 3
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design | Code | 3
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning | Code | 3
Cramming: Training a Language Model on a Single GPU in One Day | Code | 3
Retentive Network: A Successor to Transformer for Large Language Models | Code | 3
Consistency Models Made Easy | Code | 3
Inferflow: an Efficient and Highly Configurable Inference Engine for Large Language Models | Code | 3
CtrLoRA: An Extensible and Efficient Framework for Controllable Image Generation | Code | 3
Inference Performance Optimization for Large Language Models on CPUs | Code | 3
Page 9 of 226

No leaderboard results yet.