SOTAVerified

GPU

Papers

Showing 101-125 of 5629 papers

| Title | Status | Hype |
| --- | --- | --- |
| Theseus: A Library for Differentiable Nonlinear Optimization | Code | 4 |
| NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals | Code | 4 |
| Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation | Code | 4 |
| OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit | Code | 4 |
| On Scaling Up 3D Gaussian Splatting Training | Code | 4 |
| Multi-head Temporal Latent Attention | Code | 4 |
| Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training | Code | 4 |
| Building reliable sim driving agents by scaling self-play | Code | 4 |
| High-Resolution Image Synthesis with Latent Diffusion Models | Code | 4 |
| Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Code | 4 |
| DeepSpeed Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale | Code | 4 |
| MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts | Code | 4 |
| 70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float | Code | 4 |
| FedML Parrot: A Scalable Federated Learning System via Heterogeneity-aware Scheduling on Sequential and Hierarchical Training | Code | 4 |
| FFCV: Accelerating Training by Removing Data Bottlenecks | Code | 4 |
| Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion | Code | 4 |
| EvoX: A Distributed GPU-accelerated Framework for Scalable Evolutionary Computation | Code | 4 |
| JAX-Fluids 2.0: Towards HPC for Differentiable CFD of Compressible Two-phase Flows | Code | 4 |
| fastai: A Layered API for Deep Learning | Code | 4 |
| Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference | Code | 4 |
| 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering | Code | 4 |
| Billion-scale similarity search with GPUs | Code | 4 |
| Accelerating Visual-Policy Learning through Parallel Differentiable Simulation | Code | 4 |
| LCM-LoRA: A Universal Stable-Diffusion Acceleration Module | Code | 4 |
| Mamba-FETrack: Frame-Event Tracking via State Space Model | Code | 4 |

No leaderboard results yet.