GPU

Papers

Showing 2051-2100 of 5629 papers

| Title | Status | Hype |
|---|---|---|
| Multiscale Byte Language Models -- A Hierarchical Architecture for Causal Million-Length Sequence Modeling | Code | 0 |
| Towards Efficient Automatic Self-Pruning of Large Language Models | | 0 |
| Distributed U-net model and Image Segmentation for Lung Cancer Detection | | 0 |
| Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective | | 0 |
| ParallelComp: Parallel Long-Context Compressor for Length Extrapolation | | 0 |
| FairKV: Balancing Per-Head KV Cache for Fast Multi-GPU Inference | | 0 |
| Learning conformational ensembles of proteins based on backbone geometry | | 0 |
| Activation-aware Probe-Query: Effective Key-Value Retrieval for Long-Context LLMs Inference | | 0 |
| GPU-Friendly Laplacian Texture Blending | | 0 |
| LSR-Adapt: Ultra-Efficient Parameter Tuning with Matrix Low Separation Rank Kernel Adaptation | | 0 |
| SPPD: Self-training with Process Preference Learning Using Dynamic Value Margin | | 0 |
| RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression | | 0 |
| MEX: Memory-efficient Approach to Referring Multi-Object Tracking | | 0 |
| Astra: Efficient and Money-saving Automatic Parallel Strategies Search on Heterogeneous GPUs | | 0 |
| An Experimental Study of SOTA LiDAR Segmentation Models | | 0 |
| BaKlaVa -- Budgeted Allocation of KV cache for Long-context Inference | | 0 |
| SEA: Low-Resource Safety Alignment for Multimodal Large Language Models via Synthetic Embeddings | Code | 0 |
| GPU Memory Usage Optimization for Backward Propagation in Deep Network Training | | 0 |
| Rotate, Clip, and Partition: Towards W2A4KV4 Quantization by Integrating Rotation and Learnable Non-uniform Quantizer | | 0 |
| Energy-Conscious LLM Decoding: Impact of Text Generation Strategies on GPU Energy Consumption | | 0 |
| Real-time Neural Rendering of LiDAR Point Clouds | | 0 |
| Fate: Fast Edge Inference of Mixture-of-Experts Models via Cross-Layer Gate | Code | 0 |
| Massively Scaling Explicit Policy-conditioned Value Functions | | 0 |
| GPU-accelerated Multi-relational Parallel Graph Retrieval for Web-scale Recommendations | | 0 |
| TPCap: Unlocking Zero-Shot Image Captioning with Trigger-Augmented and Multi-Modal Purification Modules | | 0 |
| JExplore: Design Space Exploration Tool for Nvidia Jetson Boards | Code | 0 |
| An Efficient Large Recommendation Model: Towards a Resource-Optimal Scaling Law | | 0 |
| E-MD3C: Taming Masked Diffusion Transformers for Efficient Zero-Shot Object Customization | | 0 |
| On LLM-generated Logic Programs and their Inference Execution Methods | | 0 |
| Latents of latents to delineate pixels: hybrid Matryoshka autoencoder-to-U-Net pairing for segmenting large medical images in GPU-poor and low-data regimes | | 0 |
| Efficient solution validation of constraint satisfaction problems on neuromorphic hardware: the case of Sudoku puzzles | Code | 0 |
| InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU | | 0 |
| CopySpec: Accelerating LLMs with Speculative Copy-and-Paste Without Compromising Quality | Code | 0 |
| High-Throughput SAT Sampling | Code | 0 |
| Neuromorphic Principles for Efficient Large Language Models on Intel Loihi 2 | | 0 |
| Numerical Schemes for Signature Kernels | Code | 0 |
| Inference-time sparse attention with asymmetric indexing | | 0 |
| Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers | | 0 |
| Fast-COS: A Fast One-Stage Object Detector Based on Reparameterized Attention Vision Transformer for Autonomous Driving | | 0 |
| Memory Analysis on the Training Course of DeepSeek Models | | 0 |
| Memory Is Not the Bottleneck: Cost-Efficient Continual Learning via Weight Space Consolidation | | 0 |
| Exploiting Sparsity for Long Context Inference: Million Token Contexts on Commodity GPUs | Code | 0 |
| MoETuner: Optimized Mixture of Expert Serving with Balanced Expert Placement and Token Routing | | 0 |
| Accelerating Outlier-robust Rotation Estimation by Stereographic Projection | | 0 |
| Klotski: Efficient Mixture-of-Expert Inference via Expert-Aware Multi-Batch Pipeline | Code | 0 |
| Crypto Miner Attack: GPU Remote Code Execution Attacks | | 0 |
| fMoE: Fine-Grained Expert Offloading for Large Mixture-of-Experts Serving | | 0 |
| InfiniteHBD: Building Datacenter-Scale High-Bandwidth Domain for LLM with Optical Circuit Switching Transceivers | | 0 |
| Unrealized Expectations: Comparing AI Methods vs Classical Algorithms for Maximum Independent Set | | 0 |
| Fast Sampling of Cosmological Initial Conditions with Gaussian Neural Posterior Estimation | | 0 |
Page 42 of 113