SOTAVerified

GPU

Papers

Showing 1001-1050 of 5629 papers

| Title | Status | Hype |
| --- | --- | --- |
| PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing | Code | 1 |
| CTC-GMM: CTC guided modality matching for fast and accurate streaming speech translation | | 0 |
| Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective | Code | 1 |
| PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms | | 0 |
| High-Speed Stereo Visual SLAM for Low-Powered Computing Devices | Code | 3 |
| Fast Object Detection with a Machine Learning Edge Device | | 0 |
| SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation | Code | 3 |
| Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning | | 0 |
| Compute Or Load KV Cache? Why Not Both? | | 0 |
| LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy | | 0 |
| Online Energy Optimization in GPUs: A Multi-Armed Bandit Approach | Code | 0 |
| LLM-Pilot: Characterize and Optimize Performance of your LLM Inference Services | Code | 1 |
| Efficient Semantic Segmentation via Lightweight Multiple-Information Interaction Network | | 0 |
| Learning from Offline Foundation Features with Tensor Augmentations | | 0 |
| Efficient Residual Learning with Mixture-of-Experts for Universal Dexterous Grasping | | 0 |
| LLMCO2: Advancing Accurate Carbon Footprint Prediction for LLM Inferences | | 0 |
| An Efficient Inference Frame for SMLM (Single-Molecule Localization Microscopy) | Code | 0 |
| Contextual Document Embeddings | | 0 |
| Depth Pro: Sharp Monocular Metric Depth in Less Than a Second | Code | 9 |
| Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices | Code | 1 |
| A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts | | 0 |
| FlashMask: Efficient and Rich Mask Extension of FlashAttention | | 0 |
| Scalable and Consistent Graph Neural Networks for Distributed Mesh-based Data-driven Modeling | | 0 |
| Replacement Learning: Training Vision Tasks with Fewer Learnable Parameters | | 0 |
| ConServe: Harvesting GPUs for Low-Latency and High-Throughput Large Language Model Serving | | 0 |
| VectorGraphNET: Graph Attention Networks for Accurate Segmentation of Complex Technical Drawings | | 0 |
| TorchSISSO: A PyTorch-Based Implementation of the Sure Independence Screening and Sparsifying Operator for Efficient and Interpretable Model Discovery | Code | 1 |
| Lotus: learning-based online thermal and latency variation management for two-stage detectors on edge devices | Code | 0 |
| ROK Defense M&S in the Age of Hyperscale AI: Concepts, Challenges, and Future Directions | | 0 |
| MoS: Unleashing Parameter Efficiency of Low-Rank Adaptation with Mixture of Shards | | 0 |
| STGformer: Efficient Spatiotemporal Graph Transformer for Traffic Forecasting | Code | 1 |
| LayerKV: Optimizing Large Language Model Serving with Layer-wise KV Cache Management | Code | 3 |
| ManiSkill3: GPU Parallelized Robotics Simulation and Rendering for Generalizable Embodied AI | Code | 7 |
| Characterizing and Efficiently Accelerating Multimodal Generation Model Inference | | 0 |
| HEADS-UP: Head-Mounted Egocentric Dataset for Trajectory Prediction in Blind Assistance Systems | | 0 |
| Simple and Fast Distillation of Diffusion Models | Code | 3 |
| Simulation-based inference with the Python Package sbijax | | 0 |
| Analog In-Memory Computing Attention Mechanism for Fast and Energy-Efficient Large Language Models | Code | 1 |
| Gradient-free Decoder Inversion in Latent Diffusion Models | | 0 |
| TensorSocket: Shared Data Loading for Deep Learning Training | | 0 |
| Scalable Cross-Entropy Loss for Sequential Recommendations with Large Item Catalogs | Code | 1 |
| DRL-STNet: Unsupervised Domain Adaptation for Cross-modality Medical Image Segmentation via Disentangled Representation Learning | | 0 |
| Input-Dependent Power Usage in GPUs | Code | 0 |
| Efficient Arbitrary Precision Acceleration for Large Language Models on GPU Tensor Cores | | 0 |
| Behaviour4All: in-the-wild Facial Behaviour Analysis Toolkit | | 0 |
| LightAvatar: Efficient Head Avatar as Dynamic Neural Light Field | Code | 1 |
| MALPOLON: A Framework for Deep Species Distribution Modeling | Code | 1 |
| Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction | Code | 2 |
| Search for Efficient Large Language Models | Code | 1 |
| Efficient and generalizable nested Fourier-DeepONet for three-dimensional geological carbon sequestration | Code | 0 |

No leaderboard results yet.