SOTAVerified

GPU

Papers

Showing 101–150 of 5,629 papers

Title | Status | Hype
--- | --- | ---
DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models | Code | 4
KernelBench: Can LLMs Write Efficient GPU Kernels? | Code | 4
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float | Code | 4
GPUTreeShap: Massively Parallel Exact Calculation of SHAP Scores for Tree Ensembles | Code | 4
DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality | Code | 4
Real-time volumetric rendering of dynamic humans | Code | 4
Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | Code | 4
RTMDet: An Empirical Study of Designing Real-Time Object Detectors | Code | 4
SocialED: A Python Library for Social Event Detection | Code | 4
TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization | Code | 4
fVDB: A Deep-Learning Framework for Sparse, Large-Scale, and High-Performance Spatial Intelligence | Code | 4
4D Gaussian Splatting for Real-Time Dynamic Scene Rendering | Code | 4
Generating and Imputing Tabular Data via Diffusion and Flow-based Gradient-Boosted Trees | Code | 4
Accelerating Visual-Policy Learning through Parallel Differentiable Simulation | Code | 4
PLAID: An Efficient Engine for Late Interaction Retrieval | Code | 4
Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation | Code | 4
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis | Code | 4
CoTracker: It is Better to Track Together | Code | 4
PointMamba: A Simple State Space Model for Point Cloud Analysis | Code | 4
Optimizing LLM Inference: Fluid-Guided Online Scheduling with Memory Constraints | Code | 4
OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit | Code | 4
FedML Parrot: A Scalable Federated Learning System via Heterogeneity-aware Scheduling on Sequential and Hierarchical Training | Code | 4
FFCV: Accelerating Training by Removing Data Bottlenecks | Code | 4
NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals | Code | 4
Otter: A Multi-Modal Model with In-Context Instruction Tuning | Code | 4
Billion-scale similarity search with GPUs | Code | 4
MoE++: Accelerating Mixture-of-Experts Methods with Zero-Computation Experts | Code | 4
Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion | Code | 4
AudioLDM: Text-to-Audio Generation with Latent Diffusion Models | Code | 4
Building reliable sim driving agents by scaling self-play | Code | 4
EvoX: A Distributed GPU-accelerated Framework for Scalable Evolutionary Computation | Code | 4
fastai: A Layered API for Deep Learning | Code | 4
Multi-head Temporal Latent Attention | Code | 4
PIN-SLAM: LiDAR SLAM Using a Point-Based Implicit Neural Representation for Achieving Global Map Consistency | Code | 4
QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving | Code | 4
Theseus: A Library for Differentiable Nonlinear Optimization | Code | 4
MegaBlocks: Efficient Sparse Training with Mixture-of-Experts | Code | 3
Merlin: A Vision Language Foundation Model for 3D Computed Tomography | Code | 3
EscherNet: A Generative Model for Scalable View Synthesis | Code | 3
MetaDE: Evolving Differential Evolution by Differential Evolution | Code | 3
3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt | Code | 3
ASE: Large-Scale Reusable Adversarial Skill Embeddings for Physically Simulated Characters | Code | 3
Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence | Code | 3
MagicPIG: LSH Sampling for Efficient LLM Generation | Code | 3
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | Code | 3
Arctic Long Sequence Training: Scalable And Efficient Training For Multi-Million Token Sequences | Code | 3
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models | Code | 3
Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy | Code | 3
M+: Extending MemoryLLM with Scalable Long-Term Memory | Code | 3
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale | Code | 3
Page 3 of 113

No leaderboard results yet.