SOTAVerified

GPU

Papers

Showing 2401-2450 of 5629 papers

Title | Status | Hype
LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark | Code | 1
Push: Concurrent Probabilistic Programming for Bayesian Deep Learning | Code | 0
Finding Hamiltonian cycles with graph neural networks | Code | 0
EfficientBioAI: Making Bioimaging AI Models Efficient in Energy, Latency and Representation | Code | 1
S^3: Increasing GPU Utilization during Generative Inference for Higher Throughput | | 0
StreetSurf: Extending Multi-view Implicit Surface Reconstruction to Street Views | Code | 2
Does Long-Term Series Forecasting Need Complex Attention and Extra Long Inputs? | Code | 1
Optimized Crystallographic Graph Generation for Material Science | Code | 0
Modulation Classification Through Deep Learning Using Resolution Transformed Spectrograms | | 0
Revisiting Neural Retrieval on Accelerators | Code | 1
Towards Memory-Efficient Training for Extremely Large Output Spaces -- Learning with 500k Labels on a Single Commodity GPU | | 0
DVIS: Decoupled Video Instance Segmentation Framework | Code | 1
SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression | Code | 2
DVFO: Learning-Based DVFS for Energy-Efficient Edge-Cloud Collaborative Inference | | 0
Lightweight Vision Transformer with Bidirectional Interaction | Code | 1
Wuerstchen: An Efficient Architecture for Large-Scale Text-to-Image Diffusion Models | Code | 2
Accelerated Fingerprint Enhancement: A GPU-Optimized Mixed Architecture Approach | | 0
Autism Disease Detection Using Transfer Learning Techniques: Performance Comparison Between Central Processing Unit vs Graphics Processing Unit Functions for Neural Networks | | 0
Special Session: Approximation and Fault Resiliency of DNN Accelerators | | 0
Adam Accumulation to Reduce Memory Footprints of both Activations and Gradients for Large-scale DNN Training | | 0
Neuron to Graph: Interpreting Language Model Neurons at Scale | Code | 0
Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts | Code | 1
CTSN: Predicting Cloth Deformation for Skeleton-based Characters with a Two-stream Skinning Network | | 0
SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics | | 0
Bringing regularized optimal transport to lightspeed: a splitting method adapted for GPUs | | 0
Search-Based Regular Expression Inference on a GPU | Code | 1
Fine-Tuning Language Models with Just Forward Passes | Code | 3
Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference | Code | 2
RT-kNNS Unbound: Using RT Cores to Accelerate Unrestricted Neighbor Search | | 0
Pulse shape discrimination based on the Tempotron: a powerful classifier on GPU | Code | 0
AudioDec: An Open-source Streaming High-fidelity Neural Audio Codec | Code | 2
MixFormerV2: Efficient Fully Transformer Tracking | Code | 2
Sliding Window Sum Algorithms for Deep Neural Networks | | 0
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory | Code | 2
Dynamic Data Augmentation via MCTS for Prostate MRI Segmentation | Code | 0
ComSL: A Composite Speech-Language Model for End-to-End Speech-to-Text Translation | Code | 1
Harnessing the Power of Large Language Models for Natural Language to First-Order Logic Translation | Code | 1
Optimal Linear Subspace Search: Learning to Construct Fast and High-Quality Schedulers for Diffusion Models | Code | 0
AWESOME: GPU Memory-constrained Long Document Summarization using Memory Mechanism and Global Salient Content | | 0
AutoDepthNet: High Frame Rate Depth Map Reconstruction using Commodity Depth and RGB Cameras | | 0
READ: Recurrent Adaptation of Large Transformers | | 0
Graph Analysis Using a GPU-based Parallel Algorithm: Quantum Clustering | | 0
QLoRA: Efficient Finetuning of Quantized LLMs | Code | 6
An Accelerated Pipeline for Multi-label Renal Pathology Image Segmentation at the Whole Slide Image Level | Code | 1
Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach | Code | 2
Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks | Code | 1
Can ChatGPT Detect Intent? Evaluating Large Language Models for Spoken Language Understanding | | 0
Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference | Code | 0
Integer or Floating Point? New Outlooks for Low-Bit Quantization on Large Language Models | | 0
Taming Resource Heterogeneity In Distributed ML Training With Dynamic Batching | | 0
