SOTAVerified

GPU

Papers

Showing 2551–2600 of 5629 papers

Title | Status | Hype
Optimus: Accelerating Large-Scale Multi-Modal LLM Training by Bubble Exploitation | | 0
Quantum Annealing based Power Grid Partitioning for Parallel Simulation | | 0
PackMamba: Efficient Processing of Variable-Length Sequences in Mamba training | | 0
L3iTC at the FinLLM Challenge Task: Quantization for Financial Text Classification & Summarization | | 0
A Real-Time Adaptive Multi-Stream GPU System for Online Approximate Nearest Neighborhood Search | | 0
SLO-aware GPU Frequency Scaling for Energy Efficient LLM Inference Serving | | 0
VoxelTrack: Exploring Voxel Representation for 3D Point Cloud Object Tracking | | 0
PanoFree: Tuning-Free Holistic Multi-view Image Generation with Cross-view Self-Guidance | | 0
FT K-means: A High-Performance K-means on GPU with Fault Tolerance | Code | 0
The Impact of Hyperparameters on Large Language Model Inference Performance: An Evaluation of vLLM and HuggingFace Pipelines | | 0
Data-Driven Traffic Simulation for an Intersection in a Metropolis | | 0
Enabling High Data Throughput Reinforcement Learning on GPUs: A Domain Agnostic Framework for Data-Driven Scientific Research | | 0
Towards Scalable GPU-Accelerated SNN Training via Temporal Fusion | Code | 0
Finch: Prompt-guided Key-Value Cache Compression | | 0
ThinK: Thinner Key Cache by Query-Driven Pruning | | 0
NeuroSEM: A hybrid framework for simulating multiphysics problems by coupling PINNs and spectral elements | Code | 0
GPU-based data processing for speeding-up correlation plenoptic imaging | | 0
Toward Efficient Permutation for Hierarchical N:M Sparsity on GPUs | | 0
ByteCheckpoint: A Unified Checkpointing System for Large Foundation Model Development | | 0
Graphite: A Graph-based Extreme Multi-Label Short Text Classifier for Keyphrase Recommendation | | 0
Simply Trainable Nearest Neighbour Machine Translation with GPU Inference | | 0
SAPG: Split and Aggregate Policy Gradients | | 0
Mini-batch Coresets for Memory-efficient Training of Large Language Models | | 0
WindsorML: High-Fidelity Computational Fluid Dynamics Dataset For Automotive Aerodynamics | | 0
NARVis: Neural Accelerated Rendering for Real-Time Scientific Point Cloud Visualization | | 0
Textile Anomaly Detection: Evaluation of the State-of-the-Art for Automated Quality Inspection of Carpet | | 0
HG-PIPE: Vision Transformer Acceleration with Hybrid-Grained Pipeline | | 0
Keep the Cost Down: A Review on Methods to Optimize LLM's KV-Cache Consumption | Code | 0
SPLAT: A framework for optimised GPU code-generation for SParse reguLar ATtention | | 0
A Pairwise Comparison Relation-assisted Multi-objective Evolutionary Neural Architecture Search Method with Multi-population Mechanism | | 0
Automated Road Safety: Enhancing Sign and Surface Damage Detection with AI | | 0
LSM-GNN: Large-scale Storage-based Multi-GPU GNN Training by Optimizing Data Transfer Scheme | | 0
MedSAGa: Few-shot Memory Efficient Medical Image Segmentation using Gradient Low-Rank Projection in SAM | | 0
GreenStableYolo: Optimizing Inference Time and Image Quality of Text-to-Image Generation | Code | 0
Neural topology optimization: the good, the bad, and the ugly | | 0
Performance Modeling and Workload Analysis of Distributed Large Language Model Training and Inference | | 0
Mixture of Experts with Mixture of Precisions for Tuning Quality of Service | | 0
LiNR: Model Based Neural Retrieval on GPUs at LinkedIn | | 0
RoDE: Linear Rectified Mixture of Diverse Experts for Food Large Multi-Modal Models | | 0
SmartQuant: CXL-based AI Model Store in Support of Runtime Configurable Weight Quantization | | 0
ARTEMIS: A Mixed Analog-Stochastic In-DRAM Accelerator for Transformer Neural Networks | | 0
Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors | | 0
Characterizing and Understanding HGNN Training on GPUs | | 0
MEMO: Fine-grained Tensor Management For Ultra-long Context LLM Training | | 0
Learning Multi-view Anomaly Detection | | 0
PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer | | 0
MINI-LLM: Memory-Efficient Structured Pruning for Large Language Models | | 0
SuperPADL: Scaling Language-Directed Physics-Based Control with Progressive Supervised Distillation | | 0
Differentiable Neural-Integrated Meshfree Method for Forward and Inverse Modeling of Finite Strain Hyperelasticity | Code | 0
NGP-RT: Fusing Multi-Level Hash Features with Lightweight Attention for Real-Time Novel View Synthesis | | 0
Page 52 of 113

No leaderboard results yet.