SOTAVerified

GPU

Papers

Showing 4801–4850 of 5629 papers

Title | Status | Hype
Olica: Efficient Structured Pruning of Large Language Models without Retraining | Code | 0
Theoretical Proportion Label Perturbation for Learning from Label Proportions in Large Bags | Code | 0
Efficient semantic image segmentation with superpixel pooling | Code | 0
Mobius: A High Efficient Spatial-Temporal Parallel Training Paradigm for Text-to-Video Generation Task | Code | 0
MobiRNN: Efficient Recurrent Neural Network Execution on Mobile GPU | Code | 0
MobileDets: Searching for Object Detection Architectures for Mobile Accelerators | Code | 0
MLitB: Machine Learning in the Browser | Code | 0
MLAAN: Scaling Supervised Local Learning with Multilaminar Leap Augmented Auxiliary Network | Code | 0
Mitigating the Impact of Outlier Channels for Language Model Quantization with Activation Regularization | Code | 0
The Overcooked Generalisation Challenge | Code | 0
A Simple Model for Portable and Fast Prediction of Execution Time and Power Consumption of GPU Kernels | Code | 0
Wavelet Flow: Fast Training of High Resolution Normalizing Flows | Code | 0
AcceleratedLiNGAM: Learning Causal DAGs at the speed of GPUs | Code | 0
SFA: Small Faces Attention Face Detector | Code | 0
Efficient Parallel Methods for Deep Reinforcement Learning | Code | 0
ASAP-NMS: Accelerating Non-Maximum Suppression Using Spatially Aware Priors | Code | 0
On Boosting Semantic Street Scene Segmentation with Weak Supervision | Code | 0
ALTIS: Modernizing GPGPU Benchmarking | Code | 0
Efficient Multi-Organ Segmentation Using SpatialConfiguration-Net with Low GPU Memory Requirements | Code | 0
MIOpen: An Open Source Library For Deep Learning Primitives | Code | 0
Mind the Memory Gap: Unveiling GPU Bottlenecks in Large-Batch LLM Inference | Code | 0
Shallow Cross-Encoders for Low-Latency Retrieval | Code | 0
Efficient MPI-based Communication for GPU-Accelerated Dask Applications | Code | 0
Shape Inpainting using 3D Generative Adversarial Network and Recurrent Convolutional Networks | Code | 0
AdversarialNAS: Adversarial Neural Architecture Search for GANs | Code | 0
FastFace: Fast-converging Scheduler for Large-scale Face Recognition Training with One GPU | Code | 0
KVPR: Efficient LLM Inference with I/O-Aware KV Cache Partial Recomputation | Code | 0
MG-WFBP: Merging Gradients Wisely for Efficient Communication in Distributed Deep Learning | Code | 0
Shavette: Low Power Neural Network Acceleration via Algorithm-level Error Detection and Undervolting | Code | 0
MG-GCN: Scalable Multi-GPU GCN Training Framework | Code | 0
METER: a mobile vision transformer architecture for monocular depth estimation | Code | 0
cito: An R package for training neural networks using torch | Code | 0
Meta Networks for Neural Style Transfer | Code | 0
Message Scheduling for Performant, Many-Core Belief Propagation | Code | 0
Posterior-Guided Neural Architecture Search | Code | 0
On Exact Computation with an Infinitely Wide Neural Net | Code | 0
Memory-efficient Segmentation of High-resolution Volumetric MicroCT Images | Code | 0
Advancing Video Self-Supervised Learning via Image Foundation Models | Code | 0
Characterizing and Modeling Distributed Training with Transient Cloud GPU Servers | Code | 0
A Runtime-Adaptive Transformer Neural Network Accelerator on FPGAs | Code | 0
Online Energy Optimization in GPUs: A Multi-Armed Bandit Approach | Code | 0
A Distributed Synchronous SGD Algorithm with Global Top-k Sparsification for Low Bandwidth Networks | Code | 0
Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM | Code | 0
Online Tensor Methods for Learning Latent Variable Models | Code | 0
Memory-Efficient Implementation of DenseNets | Code | 0
Efficient Large-scale Approximate Nearest Neighbor Search on the GPU | Code | 0
Megapixel Image Generation with Step-Unrolled Denoising Autoencoders | Code | 0
Measuring the Energy Consumption and Efficiency of Deep Neural Networks: An Empirical Analysis and Design Recommendations | Code | 0
maxDNN: An Efficient Convolution Kernel for Deep Learning with Maxwell GPUs | Code | 0
SHTOcc: Effective 3D Occupancy Prediction with Sparse Head and Tail Voxels | Code | 0

No leaderboard results yet.