SOTAVerified

GPU

Papers

Showing 901-950 of 5629 papers

Title | Status | Hype
Reinforcement learning with learned gadgets to tackle hard quantum problems on real hardware | Code | 0
A Novel Breast Ultrasound Image Augmentation Method Using Advanced Neural Style Transfer: An Efficient and Explainable Approach | - | 0
The Importance of Being Scalable: Improving the Speed and Accuracy of Neural Network Interatomic Potentials Across Chemical Domains | Code | 2
Context-Aware Token Selection and Packing for Enhanced Vision Transformer | - | 0
Cycle-Constrained Adversarial Denoising Convolutional Network for PET Image Denoising: Multi-Dimensional Validation on Large Datasets with Reader Study and Real Low-Dose Data | - | 0
Very fast Bayesian Additive Regression Trees on GPU | Code | 2
$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources | Code | 2
A Message Passing Neural Network Surrogate Model for Bond-Associated Peridynamic Material Correspondence Formulation | - | 0
AI-assisted Agile Propagation Modeling for Real-time Digital Twin Wireless Networks | - | 0
Application of Audio Fingerprinting Techniques for Real-Time Scalable Speech Retrieval and Speech Clusterization | - | 0
VL-Cache: Sparsity and Modality-Aware KV Cache Compression for Vision-Language Model Inference Acceleration | - | 0
Motion Graph Unleashed: A Novel Approach to Video Prediction | Code | 0
Memory-Efficient Point Cloud Registration via Overlapping Region Sampling | - | 0
Revisiting Reliability in Large-Scale Machine Learning Research Clusters | - | 0
Pushing the Performance Envelope of DNN-based Recommendation Systems Inference on GPUs | Code | 0
Data Generation for Hardware-Friendly Post-Training Quantization | Code | 3
ProMoE: Fast MoE-based LLM Serving using Proactive Caching | - | 0
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference | Code | 3
Accelerated Bayesian parameter estimation and model selection for gravitational waves with normalizing flows | - | 0
FusedInf: Efficient Swapping of DNN Models for On-Demand Serverless Inference Services on the Edge | Code | 0
Modular Duality in Deep Learning | Code | 3
KD-LoRA: A Hybrid Approach to Efficient Fine-Tuning with LoRA and Knowledge Distillation | Code | 1
ThunderKittens: Simple, Fast, and Adorable AI Kernels | Code | 7
Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading | Code | 0
Computational Bottlenecks of Training Small-scale Large Language Models | - | 0
Parameter-Efficient Fine-Tuning in Large Models: A Survey of Methodologies | - | 0
KVSharer: Efficient Inference via Layer-Wise Dissimilar KV Cache Sharing | Code | 1
Sort-free Gaussian Splatting via Weighted Sum Rendering | - | 0
LOGO -- Long cOntext aliGnment via efficient preference Optimization | Code | 1
LoRANN: Low-Rank Matrix Factorization for Approximate Nearest Neighbor Search | Code | 2
Trajectory Optimization for Spatial Microstructure Control in Electron Beam Metal Additive Manufacturing | - | 0
CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Activation | - | 0
Is the GPU Half-Empty or Half-Full? Practical Scheduling Techniques for LLMs | - | 0
POD-Attention: Unlocking Full Prefill-Decode Overlap for Faster LLM Inference | Code | 0
ExpertFlow: Optimized Expert Activation and Token Allocation for Efficient Mixture-of-Experts Inference | - | 0
AI-focused HPC Data Centers Can Provide More Power Grid Flexibility and at Lower Cost | - | 0
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss | Code | 3
Optimizing Mixture-of-Experts Inference Time Combining Model Deployment and Communication Scheduling | - | 0
Semantic-guided Search for Efficient Program Repair with Large Language Models | - | 0
FastAttention: Extend FlashAttention2 to NPUs and Low-resource GPUs | - | 0
MagicPIG: LSH Sampling for Efficient LLM Generation | Code | 3
Mean-Field Simulation-Based Inference for Cosmological Initial Conditions | - | 0
Enabling Energy-Efficient Deployment of Large Language Models on Memristor Crossbar: A Synergy of Large and Small | - | 0
Fully Explicit Dynamic Gaussian Splatting | - | 0
CompAct: Compressed Activations for Memory-Efficient LLM Training | - | 0
A Remedy to Compute-in-Memory with Dynamic Random Access Memory: 1FeFET-1C Technology for Neuro-Symbolic AI | - | 0
SemiHVision: Enhancing Medical Multimodal Models with a Semi-Human Annotated Dataset and Fine-Tuned Instruction Generation | Code | 0
Accelerate Coastal Ocean Circulation Model with AI Surrogate | - | 0
Adversarial Score identity Distillation: Rapidly Surpassing the Teacher in One Step | Code | 2
AC-Mix: Self-Supervised Adaptation for Low-Resource Automatic Speech Recognition using Agnostic Contrastive Mixup | - | 0
Page 19 of 113

No leaderboard results yet.