GPU

Papers

Showing 226–250 of 5629 papers

Title	Date	Tasks	Status	Hype
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs	Feb 6, 2024	Binarization, GPU	Code Available	3
Graph-Mamba: Towards Long-Range Graph Sequence Modeling with Selective State Spaces	Feb 1, 2024	Computational Efficiency, GPU	Code Available	3
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization	Jan 31, 2024	GPU, Quantization	Code Available	3
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design	Jan 25, 2024	GPU, Quantization	Code Available	3
MoE-Infinity: Efficient MoE Inference on Personal Machines with Sparsity-Aware Expert Cache	Jan 25, 2024	GPU, model	Code Available	3
Inferflow: an Efficient and Highly Configurable Inference Engine for Large Language Models	Jan 16, 2024	GPU, Quantization	Code Available	3
RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation	Jan 9, 2024	GPU, Math	Code Available	3
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models	Jan 9, 2024	GPU	Code Available	3
MobileVLM : A Fast, Strong and Open Vision Language Assistant for Mobile Devices	Dec 28, 2023	AutoML, CPU	Code Available	3
XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library	Dec 25, 2023	CPU, Deep Reinforcement Learning	Code Available	3
Splatter Image: Ultra-Fast Single-View 3D Reconstruction	Dec 20, 2023	3D Object Reconstruction, 3D Reconstruction	Code Available	3
S-LoRA: Serving Thousands of Concurrent LoRA Adapters	Nov 6, 2023	GPU, parameter-efficient fine-tuning	Code Available	3
Punica: Multi-Tenant LoRA Serving	Oct 28, 2023	GPU	Code Available	3
TorchSparse++: Efficient Training and Inference Framework for Sparse Convolution on GPUs	Oct 25, 2023	Autonomous Driving, GPU	Code Available	3
Take the aTrain. Introducing an Interface for the Accessible Transcription of Interviews	Oct 18, 2023	CPU, GPU	Code Available	3
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation	Sep 27, 2023	GPU, Text-to-Video Generation	Code Available	3
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation	Sep 12, 2023	GPU, Image Generation	Code Available	3
nanoT5: A PyTorch Framework for Pre-training and Fine-tuning T5-style Models with Limited Resources	Sep 5, 2023	Decoder, GPU	Code Available	3
Retentive Network: A Successor to Transformer for Large Language Models	Jul 17, 2023	GPU, Language Modeling	Code Available	3
TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement	Jun 14, 2023	GPU, Motion Estimation	Code Available	3
Fine-Tuning Language Models with Just Forward Passes	May 27, 2023	GPU, In-Context Learning	Code Available	3
Unlimiformer: Long-Range Transformers with Unlimited Length Input	May 2, 2023	Book summarization, CPU	Code Available	3
TorchBench: Benchmarking PyTorch with High API Surface Coverage	Apr 27, 2023	Benchmarking, GPU	Code Available	3
FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization	Mar 24, 2023	3D Hand Pose Estimation, GPU	Code Available	3
EvoTorch: Scalable Evolutionary Computation in Python	Feb 24, 2023	GPU, reinforcement-learning	Code Available	3

Page 10 of 226

No leaderboard results yet.