SOTAVerified

GPU

Papers

Showing 1151-1200 of 5629 papers

Title | Status | Hype
AtMan: Understanding Transformer Predictions Through Memory Efficient Attention Manipulation | Code | 1
CuAsmRL: Optimizing GPU SASS Schedules via Deep Reinforcement Learning | Code | 1
Effective Batching for Recurrent Neural Network Grammars | Code | 1
LightViT: Towards Light-Weight Convolution-Free Vision Transformers | Code | 1
LiVOS: Light Video Object Segmentation with Gated Linear Matching | Code | 1
LightAvatar: Efficient Head Avatar as Dynamic Neural Light Field | Code | 1
A flexible and fast PyTorch toolkit for simulating training and inference on analog crossbar arrays | Code | 1
Efficient Lifelong Model Evaluation in an Era of Rapid Progress | Code | 1
Easy and Efficient Transformer : Scalable Inference Solution For large NLP model | Code | 1
EAGAN: Efficient Two-stage Evolutionary Architecture Search for GANs | Code | 1
Edge-MoE: Memory-Efficient Multi-Task Vision Transformer Architecture with Task-level Sparsity via Mixture-of-Experts | Code | 1
Lettuce: PyTorch-based Lattice Boltzmann Framework | Code | 1
AFDet: Anchor Free One Stage 3D Object Detection | Code | 1
CryptGPU: Fast Privacy-Preserving Machine Learning on the GPU | Code | 1
Asyncval: A Toolkit for Asynchronously Validating Dense Retriever Checkpoints during Training | Code | 1
LeetDecoding: A PyTorch Library for Exponentially Decaying Causal Linear Attention with CUDA Implementations | Code | 1
EdgeNAT: Transformer for Efficient Edge Detection | Code | 1
Learning Tracking Representations via Dual-Branch Fully Transformer Networks | Code | 1
A Fast Post-Training Pruning Framework for Transformers | Code | 1
Dynamic Sparse Training with Structured Sparsity | Code | 1
Learning Universal Shape Dictionary for Realtime Instance Segmentation | Code | 1
LeRF: Learning Resampling Function for Adaptive and Efficient Image Interpolation | Code | 1
LightRetriever: A LLM-based Hybrid Retrieval Architecture with 1000x Faster Query Inference | Code | 1
Dynamic Structure Pruning for Compressing CNNs | Code | 1
Dynamic Low-Rank Sparse Adaptation for Large Language Models | Code | 1
Asynchronous Methods for Deep Reinforcement Learning | Code | 1
Dynamic Mesh-Aware Radiance Fields | Code | 1
Dynamic-OFA: Runtime DNN Architecture Switching for Performance Scaling on Heterogeneous Embedded Platforms | Code | 1
Learning to Generate Wasserstein Barycenters | Code | 1
Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning | Code | 1
Edge and Identity Preserving Network for Face Super-Resolution | Code | 1
Dynamic GPU Energy Optimization for Machine Learning Training Workloads | Code | 1
Efficient Video Compression via Content-Adaptive Super-Resolution | Code | 1
Dynamic Perceiver for Efficient Visual Recognition | Code | 1
Learning to Enhance Low-Light Image via Zero-Reference Deep Curve Estimation | Code | 1
Learning to Upsample by Learning to Sample | Code | 1
Cross-Camera Convolutional Color Constancy | Code | 1
CUDA-Optimized real-time rendering of a Foveated Visual System | Code | 1
Dynamic Pooling Improves Nanopore Base Calling Accuracy | Code | 1
Cross-Batch Memory for Embedding Learning | Code | 1
Dyna-DM: Dynamic Object-aware Self-supervised Monocular Depth Maps | Code | 1
Efficient Classification of Very Large Images with Tiny Objects | Code | 1
Dynamic DNNs and Runtime Management for Efficient Inference on Mobile/Embedded Devices | Code | 1
EEEA-Net: An Early Exit Evolutionary Neural Architecture Search | Code | 1
Lightweight Neural Architecture Search for Temporal Convolutional Networks at the Edge | Code | 1
CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method | Code | 1
CritiPrefill: A Segment-wise Criticality-based Approach for Prefilling Acceleration in LLMs | Code | 1
EfficientBioAI: Making Bioimaging AI Models Efficient in Energy, Latency and Representation | Code | 1
DVIS: Decoupled Video Instance Segmentation Framework | Code | 1
Transformer Tracking | Code | 1

No leaderboard results yet.