SOTAVerified

GPU

Papers

Showing 1951–2000 of 5629 papers

Title | Status | Hype
A Case Study in CUDA Kernel Fusion: Implementing FlashAttention-2 on NVIDIA Hopper Architecture using the CUTLASS Library | Code | 2
Regulating Intermediate 3D Features for Vision-Centric Autonomous Driving | Code | 1
GauFRe: Gaussian Deformation Fields for Real-time Dynamic Novel View Synthesis | - | 0
Tuning LayerNorm in Attention: Towards Efficient Multi-Modal LLM Finetuning | - | 0
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU | Code | 5
Opara: Exploiting Operator Parallelism for Expediting DNN Inference on GPUs | Code | 1
RetailKLIP : Finetuning OpenCLIP backbone using metric learning on a single GPU for Zero-shot retail product image classification | - | 0
FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices with A Simple Super-Resolution Pipeline | - | 0
Binary Code Summarization: Benchmarking ChatGPT/GPT-4 and Other Large Language Models | Code | 1
Data-Efficient Multimodal Fusion on a Single GPU | Code | 1
LiteVSR: Efficient Visual Speech Recognition by Learning from Speech Representations of Unlabeled Data | - | 0
A parallelized cellular Potts model that enables simulations at tissue scale | Code | 0
MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training | Code | 1
A Sparse Cross Attention-based Graph Convolution Network with Auxiliary Information Awareness for Traffic Flow Prediction | - | 0
Context-PEFT: Efficient Multi-Modal, Multi-Task Fine-Tuning | - | 0
Dataset Distillation via Adversarial Prediction Matching | Code | 0
Efficient-NeRF2NeRF: Streamlining Text-Driven 3D Editing with Multiview Correspondence-Enhanced Diffusion Models | - | 0
Contractive error feedback for gradient compression | - | 0
EZ-CLIP: Efficient Zeroshot Video Action Recognition | Code | 1
CBQ: Cross-Block Quantization for Large Language Models | - | 0
Modality Plug-and-Play: Elastic Modality Adaptation in Multimodal LLMs for Embodied AI | Code | 1
DTL: Disentangled Transfer Learning for Visual Recognition | Code | 1
Memory-Efficient Reversible Spiking Neural Networks | Code | 1
LLM in a flash: Efficient Large Language Model Inference with Limited Memory | - | 0
Neural Video Fields Editing | - | 0
XC-NAS: A New Cellular Encoding Approach for Neural Architecture Search of Multi-path Convolutional Neural Networks | - | 0
Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models | Code | 1
GateNet: A novel Neural Network Architecture for Automated Flow Cytometry Gating | Code | 1
FULL-W2V: Fully Exploiting Data Reuse for W2V on GPU-Accelerated Systems | Code | 0
Exploring Plain ViT Reconstruction for Multi-class Unsupervised Anomaly Detection | - | 0
RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation | - | 0
PatchMorph: A Stochastic Deep Learning Approach for Unsupervised 3D Brain Image Registration with Small Patches | - | 0
DYAD: A Descriptive Yet Abjuring Density efficient approximation to linear neural network layers | Code | 0
BACTrack: Building Appearance Collection for Aerial Tracking | - | 0
Compound Text-Guided Prompt Tuning via Image-Adaptive Cues | Code | 1
Stateful Large Language Model Serving with Pensieve | - | 0
PILLOW: Enhancing Efficient Instruction Fine-tuning via Prompt Matching | - | 0
PixLore: A Dataset-driven Approach to Rich Image Captioning | Code | 0
Tenplex: Dynamic Parallelism for Deep Learning using Parallelizable Tensor Collections | Code | 1
DARLEI: Deep Accelerated Reinforcement Learning with Evolutionary Intelligence | - | 0
Approximate Caching for Efficiently Serving Diffusion Models | - | 0
PerSival: Neural-network-based visualisation for pervasive continuum-mechanical simulations in musculoskeletal biomechanics | - | 0
SmoothQuant+: Accurate and Efficient 4-bit Post-Training WeightQuantization for LLM | Code | 1
MMM: Generative Masked Motion Model | Code | 1
On the Diversity and Realism of Distilled Dataset: An Efficient Dataset Distillation Paradigm | Code | 1
Holmes: Towards Distributed Training Across Clusters with Heterogeneous NIC Environment | - | 0
A Hardware Evaluation Framework for Large Language Model Inference | - | 0
FlexModel: A Framework for Interpretability of Distributed Large Language Models | Code | 1
Learning to Holistically Detect Bridges from Large-Size VHR Remote Sensing Imagery | - | 0
DIPR: Efficient Point Cloud Registration via Dynamic Iteration | - | 0
Page 40 of 113