SOTAVerified

GPU

Papers

Showing 426-450 of 5629 papers

Title | Status | Hype
QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference | Code | 2
On the Efficacy of Eviction Policy for Key-Value Constrained Generative Language Model Inference | Code | 2
λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space | Code | 2
4D-Rotor Gaussian Splatting: Towards Efficient Novel View Synthesis for Dynamic Scenes | Code | 2
Cross-Scale MAE: A Tale of Multi-Scale Exploitation in Remote Sensing | Code | 2
SHViT: Single-Head Vision Transformer with Memory Efficient Macro Design | Code | 2
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model | Code | 2
Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction | Code | 2
Low-resource finetuning of foundation models beats state-of-the-art in histopathology | Code | 2
WidthFormer: Toward Efficient Transformer-based BEV View Transformation | Code | 2
Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks | Code | 2
CoMoSVC: Consistency Model-based Singing Voice Conversion | Code | 2
MosaicBERT: A Bidirectional Encoder Optimized for Fast Pretraining | Code | 2
Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis | Code | 2
Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference | Code | 2
A Case Study in CUDA Kernel Fusion: Implementing FlashAttention-2 on NVIDIA Hopper Architecture using the CUTLASS Library | Code | 2
XLand-MiniGrid: Scalable Meta-Reinforcement Learning Environments in JAX | Code | 2
mLoRA: Fine-Tuning LoRA Adapters via Highly-Efficient Pipeline Parallelism in Multiple GPUs | Code | 2
CoLLiE: Collaborative Training of Large Language Models in an Efficient Way | Code | 2
XLB: A differentiable massively parallel lattice Boltzmann library in Python | Code | 2
Learning to Fly in Seconds | Code | 2
Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model | Code | 2
JaxMARL: Multi-Agent RL Environments and Algorithms in JAX | Code | 2
Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster | Code | 2
Black-Box Prompt Optimization: Aligning Large Language Models without Model Training | Code | 2

No leaderboard results yet.