SOTAVerified: GPU Papers

Showing 2351–2375 of 5629 papers

| Title | Status | Hype |
| --- | --- | --- |
| Sort-free Gaussian Splatting via Weighted Sum Rendering | | 0 |
| Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies | | 0 |
| ExpertFlow: Optimized Expert Activation and Token Allocation for Efficient Mixture-of-Experts Inference | | 0 |
| Trajectory Optimization for Spatial Microstructure Control in Electron Beam Metal Additive Manufacturing | | 0 |
| Is the GPU Half-Empty or Half-Full? Practical Scheduling Techniques for LLMs | | 0 |
| CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Activation | | 0 |
| POD-Attention: Unlocking Full Prefill-Decode Overlap for Faster LLM Inference | Code | 0 |
| FastAttention: Extend FlashAttention2 to NPUs and Low-resource GPUs | | 0 |
| Semantic-guided Search for Efficient Program Repair with Large Language Models | | 0 |
| AI-focused HPC Data Centers Can Provide More Power Grid Flexibility and at Lower Cost | | 0 |
| Optimizing Mixture-of-Experts Inference Time Combining Model Deployment and Communication Scheduling | | 0 |
| Enabling Energy-Efficient Deployment of Large Language Models on Memristor Crossbar: A Synergy of Large and Small | | 0 |
| Mean-Field Simulation-Based Inference for Cosmological Initial Conditions | | 0 |
| Fully Explicit Dynamic Gaussian Splatting | | 0 |
| CompAct: Compressed Activations for Memory-Efficient LLM Training | | 0 |
| A Remedy to Compute-in-Memory with Dynamic Random Access Memory: 1FeFET-1C Technology for Neuro-Symbolic AI | | 0 |
| Accelerate Coastal Ocean Circulation Model with AI Surrogate | | 0 |
| SemiHVision: Enhancing Medical Multimodal Models with a Semi-Human Annotated Dataset and Fine-Tuned Instruction Generation | Code | 0 |
| Evaluating Quantized Large Language Models for Code Generation on Low-Resource Language Benchmarks | Code | 0 |
| AC-Mix: Self-Supervised Adaptation for Low-Resource Automatic Speech Recognition using Agnostic Contrastive Mixup | | 0 |
| Parallel Backpropagation for Inverse of a Convolution with Application to Normalizing Flows | Code | 0 |
| Takin-ADA: Emotion Controllable Audio-Driven Animation with Canonical and Landmark Loss Optimization | | 0 |
| Harnessing Your DRAM and SSD for Sustainable and Accessible LLM Inference with Mixed-Precision and Multi-level Caching | | 0 |
| MEGA: Memory-Efficient 4D Gaussian Splatting for Dynamic Scenes | | 0 |
| Shavette: Low Power Neural Network Acceleration via Algorithm-level Error Detection and Undervolting | Code | 0 |
Page 95 of 226
