SOTAVerified

GPU

Papers

Showing 2301-2350 of 5629 papers

Title | Status | Hype
AdaCM^2: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction | | 0
Automated 3D Physical Simulation of Open-world Scene with Gaussian Splatting | | 0
Modeling Multivariable High-resolution 3D Urban Microclimate Using Localized Fourier Neural Operator | | 0
Graph Retention Networks for Dynamic Graphs | Code | 0
MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs | | 0
LP Data Pipeline: Lightweight, Purpose-driven Data Pipeline for Large Language Models | | 0
Towards Accurate and Efficient Sub-8-Bit Integer Training | | 0
NeuroNURBS: Learning Efficient Surface Representations for 3D Solids | | 0
Improving training time and GPU utilization in geo-distributed language model training | | 0
MDHP-Net: Detecting an Emerging Time-exciting Threat in IVN | | 0
TEESlice: Protecting Sensitive Neural Network Models in Trusted Execution Environments When Attackers have Pre-Trained Models | | 0
Pie: Pooling CPU Memory for LLM Inference | | 0
SANDWICH: Towards an Offline, Differentiable, Fully-Trainable Wireless Neural Ray-Tracing Surrogate | Code | 0
Optimizing LLM Inference for Database Systems: Cost-Aware Scheduling for Concurrent Requests | | 0
On Adapting Randomized Nyström Preconditioners to Accelerate Variational Image Reconstruction | | 0
FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training | Code | 0
OpenThaiGPT 1.5: A Thai-Centric Open Source Large Language Model | | 0
Accelerating Large Language Model Training with 4D Parallelism and Memory Consumption Estimator | | 0
KeyB2: Selecting Key Blocks is Also Important for Long Document Ranking with Large Language Models | | 0
Benchmarking 3D multi-coil NC-PDNet MRI reconstruction | | 0
Hardware and Software Platform Inference | | 0
PropNEAT -- Efficient GPU-Compatible Backpropagation over NeuroEvolutionary Augmenting Topology Networks | | 0
Reducing Hyperparameter Tuning Costs in ML, Vision and Language Model Training Pipelines via Memoization-Awareness | Code | 0
LEGO-GraphRAG: Modularizing Graph-based Retrieval-Augmented Generation for Design Space Exploration | | 0
Efficient and Effective Adaptation of Multimodal Foundation Models in Sequential Recommendation | | 0
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization | | 0
Context Parallelism for Scalable Million-Token Inference | | 0
Stochastic Communication Avoidance for Recommendation Systems | | 0
NEO: Saving GPU Memory Crisis with CPU Offloading for Online LLM Inference | Code | 0
Hollowed Net for On-Device Personalization of Text-to-Image Diffusion Models | | 0
CRONOS: Enhancing Deep Learning with Scalable GPU Accelerated Convex Neural Networks | | 0
HopTrack: A Real-time Multi-Object Tracking System for Embedded Devices | Code | 0
Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference | | 0
A Novel Breast Ultrasound Image Augmentation Method Using Advanced Neural Style Transfer: An Efficient and Explainable Approach | | 0
Cycle-Constrained Adversarial Denoising Convolutional Network for PET Image Denoising: Multi-Dimensional Validation on Large Datasets with Reader Study and Real Low-Dose Data | | 0
Reinforcement learning with learned gadgets to tackle hard quantum problems on real hardware | Code | 0
Context-Aware Token Selection and Packing for Enhanced Vision Transformer | | 0
ProMoE: Fast MoE-based LLM Serving using Proactive Caching | | 0
Application of Audio Fingerprinting Techniques for Real-Time Scalable Speech Retrieval and Speech Clusterization | | 0
Memory-Efficient Point Cloud Registration via Overlapping Region Sampling | | 0
A Message Passing Neural Network Surrogate Model for Bond-Associated Peridynamic Material Correspondence Formulation | | 0
Revisiting Reliability in Large-Scale Machine Learning Research Clusters | | 0
AI-assisted Agile Propagation Modeling for Real-time Digital Twin Wireless Networks | | 0
Motion Graph Unleashed: A Novel Approach to Video Prediction | Code | 0
Pushing the Performance Envelope of DNN-based Recommendation Systems Inference on GPUs | Code | 0
VL-Cache: Sparsity and Modality-Aware KV Cache Compression for Vision-Language Model Inference Acceleration | | 0
Accelerated Bayesian parameter estimation and model selection for gravitational waves with normalizing flows | | 0
FusedInf: Efficient Swapping of DNN Models for On-Demand Serverless Inference Services on the Edge | Code | 0
Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading | Code | 0
Computational Bottlenecks of Training Small-scale Large Language Models | | 0
Page 47 of 113

No leaderboard results yet.