SOTAVerified

GPU

Papers

Showing 1001–1025 of 5629 papers

Title | Status | Hype
CTC-GMM: CTC guided modality matching for fast and accurate streaming speech translation | - | 0
PostEdit: Posterior Sampling for Efficient Zero-Shot Image Editing | Code | 1
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective | Code | 1
PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms | - | 0
Fast Object Detection with a Machine Learning Edge Device | - | 0
High-Speed Stereo Visual SLAM for Low-Powered Computing Devices | Code | 3
Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning | - | 0
Compute Or Load KV Cache? Why Not Both? | - | 0
SwiftKV: Fast Prefill-Optimized Inference with Knowledge-Preserving Model Transformation | Code | 3
LoRC: Low-Rank Compression for LLMs KV Cache with a Progressive Compression Strategy | - | 0
Online Energy Optimization in GPUs: A Multi-Armed Bandit Approach | Code | 0
Efficient Residual Learning with Mixture-of-Experts for Universal Dexterous Grasping | - | 0
Efficient Semantic Segmentation via Lightweight Multiple-Information Interaction Network | - | 0
LLMCO2: Advancing Accurate Carbon Footprint Prediction for LLM Inferences | - | 0
LLM-Pilot: Characterize and Optimize Performance of your LLM Inference Services | Code | 1
Learning from Offline Foundation Features with Tensor Augmentations | - | 0
An Efficient Inference Frame for SMLM (Single-Molecule Localization Microscopy) | Code | 0
Contextual Document Embeddings | - | 0
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second | Code | 9
A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts | - | 0
Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads on Consumer-Grade Devices | Code | 1
ConServe: Harvesting GPUs for Low-Latency and High-Throughput Large Language Model Serving | - | 0
Replacement Learning: Training Vision Tasks with Fewer Learnable Parameters | - | 0
TorchSISSO: A PyTorch-Based Implementation of the Sure Independence Screening and Sparsifying Operator for Efficient and Interpretable Model Discovery | Code | 1
FlashMask: Efficient and Rich Mask Extension of FlashAttention | - | 0
Page 41 of 226

No leaderboard results yet.