SOTAVerified

GPU

Papers

Showing 351–400 of 5629 papers

Title | Status | Hype
FlowR: Flowing from Sparse to Dense 3D Reconstructions |  | 0
Quattro: Transformer-Accelerated Iterative Linear Quadratic Regulator Framework for Fast Trajectory Optimization | Code | 1
Accelerating IoV Intrusion Detection: Benchmarking GPU-Accelerated vs CPU-Based ML Libraries |  | 0
Improved Visual-Spatial Reasoning via R1-Zero-Like Training | Code | 1
SentenceKV: Efficient LLM Inference via Sentence-Level Semantic KV Caching |  | 0
Open-Qwen2VL: Compute-Efficient Pre-Training of Fully-Open Multimodal LLMs on Academic Resources |  | 0
SCRec: A Scalable Computational Storage System with Statistical Sharding and Tensor-train Decomposition for Recommendation Models |  | 0
Adapting Vision Foundation Models for Real-time Ultrasound Image Segmentation |  | 0
THEMIS: Towards Practical Intellectual Property Protection for Post-Deployment On-Device Deep Learning Models | Code | 2
GPU-centric Communication Schemes for HPC and ML Applications |  | 0
Deep Learning Model Deployment in Multiple Cloud Providers: an Exploratory Study Using Low Computing Power Environments |  | 0
Orchestrate Multimodal Data with Batch Post-Balancing to Accelerate Multimodal Large Language Model Training |  | 0
StochasticSplats: Stochastic Rasterization for Sorting-Free 3D Gaussian Splatting |  | 0
Pan-LUT: Efficient Pan-sharpening via Learnable Look-Up Tables |  | 0
Cocktail: Chunk-Adaptive Mixed-Precision Quantization for Long-Context LLM Inference |  | 0
FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning | Code | 2
CityGS-X: A Scalable Architecture for Efficient and Geometrically Accurate Large-Scale Scene Reconstruction |  | 0
PartialLoading: User Scheduling and Bandwidth Allocation for Parameter-sharing Edge Inference |  | 0
Disentangled 4D Gaussian Splatting: Towards Faster and More Efficient Dynamic Scene Rendering |  | 0
WeatherMesh-3: Fast and accurate operational global weather forecasting | Code | 3
Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments |  | 0
CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models | Code | 2
Lobster: A GPU-Accelerated Framework for Neurosymbolic Programming |  | 0
FACETS: Efficient Once-for-all Object Detection via Constrained Iterative Search |  | 0
Stochastic Engrams for Efficient Continual Learning with Binarized Neural Networks |  | 0
ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model |  | 0
Robust DNN Partitioning and Resource Allocation Under Uncertain Inference Time |  | 0
Self-ReS: Self-Reflection in Large Vision-Language Models for Long Video Understanding |  | 0
High Quality Diffusion Distillation on a Single GPU with Relative and Absolute Position Matching |  | 0
Bridging Evolutionary Multiobjective Optimization and GPU Acceleration via Tensorization | Code | 7
AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation | Code | 0
A Probabilistic Neuro-symbolic Layer for Algebraic Constraint Satisfaction | Code | 1
Scaling Down Text Encoders of Text-to-Image Diffusion Models | Code | 2
Improved Alignment of Modalities in Large Vision Language Models |  | 0
PyGraph: Robust Compiler Support for CUDA Graphs in PyTorch |  | 0
Optimizing Breast Cancer Detection in Mammograms: A Comprehensive Study of Transfer Learning, Resolution Reduction, and Multi-View Classification |  | 0
Video-XL-Pro: Reconstructive Token Compression for Extremely Long Video Understanding |  | 0
Efficient Self-Supervised Adaptation for Medical Image Analysis | Code | 1
BitDecoding: Unlocking Tensor Cores for Long-Context LLMs Decoding with Low-Bit KV Cache | Code | 2
Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization |  | 0
GRiNS: A Python Library for Simulating Gene Regulatory Network Dynamics | Code | 0
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining | Code | 3
Co-SemDepth: Fast Joint Semantic Segmentation and Depth Estimation on Aerial Images | Code | 0
WindowKV: Task-Adaptive Group-Wise KV Cache Window Selection for Efficient LLM Inference | Code | 0
Temporal Action Detection Model Compression by Progressive Block Drop |  | 0
UniCon: Unidirectional Information Flow for Effective Control of Large-Scale Diffusion Models |  | 0
PP-DocLayout: A Unified Document Layout Detection Model to Accelerate Large-Scale Data Construction | Code | 9
Robustness of deep learning classification to adversarial input on GPUs: asynchronous parallel accumulation is a source of vulnerability |  | 0
Splat-LOAM: Gaussian Splatting LiDAR Odometry and Mapping | Code | 2
Improving the End-to-End Efficiency of Offline Inference for Multi-LLM Applications Based on Sampling and Simulation |  | 0
Page 8 of 113

No leaderboard results yet.