SOTAVerified

GPU

Papers

Showing 1251–1300 of 5629 papers

Title | Status | Hype
A Pairwise Comparison Relation-assisted Multi-objective Evolutionary Neural Architecture Search Method with Multi-population Mechanism | - | 0
Automated Road Safety: Enhancing Sign and Surface Damage Detection with AI | - | 0
vTensor: Flexible Virtual Tensor Management for Efficient LLM Serving | Code | 3
MedSAGa: Few-shot Memory Efficient Medical Image Segmentation using Gradient Low-Rank Projection in SAM | - | 0
LSM-GNN: Large-scale Storage-based Multi-GPU GNN Training by Optimizing Data Transfer Scheme | - | 0
GreenStableYolo: Optimizing Inference Time and Image Quality of Text-to-Image Generation | Code | 0
Performance Modeling and Workload Analysis of Distributed Large Language Model Training and Inference | - | 0
Mixture of Experts with Mixture of Precisions for Tuning Quality of Service | - | 0
Neural topology optimization: the good, the bad, and the ugly | - | 0
NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals | Code | 4
Forecasting GPU Performance for Deep Learning Training and Inference | Code | 2
Attention in SRAM on Tenstorrent Grayskull | Code | 1
WiNet: Wavelet-based Incremental Learning for Efficient Medical Image Registration | Code | 1
LiNR: Model Based Neural Retrieval on GPUs at LinkedIn | - | 0
Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark | Code | 1
SmartQuant: CXL-based AI Model Store in Support of Runtime Configurable Weight Quantization | - | 0
FastSAM-3DSlicer: A 3D-Slicer Extension for 3D Volumetric Segment Anything Model with Uncertainty Quantification | Code | 1
Spectra: Surprising Effectiveness of Pretraining Ternary Language Models at Scale | Code | 2
RoDE: Linear Rectified Mixture of Diverse Experts for Food Large Multi-Modal Models | - | 0
ARTEMIS: A Mixed Analog-Stochastic In-DRAM Accelerator for Transformer Neural Networks | - | 0
MEMO: Fine-grained Tensor Management For Ultra-long Context LLM Training | - | 0
PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer | - | 0
Learning Multi-view Anomaly Detection | - | 0
MINI-LLM: Memory-Efficient Structured Pruning for Large Language Models | - | 0
Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors | - | 0
Characterizing and Understanding HGNN Training on GPUs | - | 0
Differentiable Voxelization and Mesh Morphing | Code | 2
Differentiable Neural-Integrated Meshfree Method for Forward and Inverse Modeling of Finite Strain Hyperelasticity | Code | 0
Separable Operator Networks | Code | 1
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients | Code | 2
NGP-RT: Fusing Multi-Level Hash Features with Lightweight Attention for Real-Time Novel View Synthesis | - | 0
SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation | - | 0
LeanQuant: Accurate Large Language Model Quantization with Loss-Error-Aware Grid | - | 0
Disrupting Diffusion-based Inpainters with Semantic Digression | - | 0
LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation | Code | 1
Enhancing Training Efficiency Using Packing with Flash Attention | - | 0
Weight Block Sparsity: Training, Compilation, and AI Engine Accelerators | - | 0
FedMedICL: Towards Holistic Evaluation of Distribution Shifts in Federated Medical Imaging | Code | 1
FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision | Code | 12
Gradient Boosting Reinforcement Learning | Code | 2
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models | Code | 3
Analyzing Machine Learning Performance in a Hybrid Quantum Computing and HPC Environment | - | 0
INSIGHT: Universal Neural Simulator for Analog Circuits Harnessing Autoregressive Transformers | - | 0
MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis | Code | 2
Swiss DINO: Efficient and Versatile Vision Framework for On-device Personal Object Search | Code | 0
Inference Performance Optimization for Large Language Models on CPUs | Code | 3
HAFormer: Unleashing the Power of Hierarchy-Aware Features for Lightweight Semantic Segmentation | Code | 0
Parameter Efficient Fine Tuning for Multi-scanner PET to PET Reconstruction | - | 0
3D Gaussian Ray Tracing: Fast Tracing of Particle Scenes | - | 0
Mobius: A High Efficient Spatial-Temporal Parallel Training Paradigm for Text-to-Video Generation Task | Code | 0
Page 26 of 113

No leaderboard results yet.