SOTAVerified

GPU

Papers

Showing 1-50 of 5629 papers

Title | Status | Hype
CUDA-L1: Improving CUDA Optimization via Contrastive Reinforcement Learning | | 0
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models | | 0
DINO-VO: A Feature-based Visual Odometry Leveraging a Visual Foundation Model | | 0
Kevin: Multi-Turn RL for Generating CUDA Kernels | | 0
FourCastNet 3: A geometric approach to probabilistic machine-learning weather forecasting at scale | Code | 3
Arctic Inference with Shift Parallelism: Fast and Efficient Open Source Inference System for Enterprise AI | Code | 3
Relative Entropy Pathwise Policy Optimization | Code | 1
Lightweight Model for Poultry Disease Detection from Fecal Images Using Multi-Color Space Feature Optimization and Machine Learning | | 0
DEARLi: Decoupled Enhancement of Recognition and Localization for Semi-supervised Panoptic Segmentation | Code | 0
Scaling Attention to Very Long Sequences in Linear Time with Wavelet-Enhanced Random Spectral Attention (WERSA) | Code | 0
HNOSeg-XS: Extremely Small Hartley Neural Operator for Efficient and Resolution-Robust 3D Image Segmentation | Code | 0
Artificial Generals Intelligence: Mastering Generals.io with Reinforcement Learning | | 0
From large-eddy simulations to deep learning: A U-net model for fast urban canopy flow predictions | Code | 0
AutoTriton: Automatic Triton Programming with Reinforcement Learning in LLMs | Code | 2
Diffusion Dataset Condensation: Training Your Diffusion Model Faster with Less Data | | 0
Real-Time Graph-based Point Cloud Networks on FPGAs via Stall-Free Deep Pipelining | Code | 0
any4: Learned 4-bit Numeric Representation for LLMs | Code | 2
LLMThinkBench: Towards Basic Math Reasoning and Overthinking in Large Language Models | Code | 1
MathOptAI.jl: Embed trained machine learning predictors into JuMP models | Code | 2
SketchColour: Channel Concat Guided DiT-based Sketch-to-Colour Pipeline for 2D Animation | | 0
LoRA Fine-Tuning Without GPUs: A CPU-Efficient Meta-Generation Framework for LLMs | | 0
FADRM: Fast and Accurate Data Residual Matching for Dataset Distillation | Code | 1
MEMFOF: High-Resolution Training for Memory-Efficient Multi-Frame Optical Flow Estimation | Code | 2
VolumetricSMPL: A Neural Volumetric Body Model for Efficient Interactions, Contacts, and Collisions | Code | 2
Instella-T2I: Pushing the Limits of 1D Discrete Latent Space Image Generation | | 0
Omniwise: Predicting GPU Kernels Performance with LLMs | | 0
GPU Kernel Scientist: An LLM-Driven Framework for Iterative Kernel Optimization | | 0
Exploiting Lightweight Hierarchical ViT and Dynamic Framework for Efficient Visual Tracking | Code | 1
Fast ground penetrating radar dual-parameter full waveform inversion method accelerated by hybrid compilation of CUDA kernel function and PyTorch | Code | 1
DuoGPT: Training-free Dual Sparsity through Activation-aware Pruning in LLMs | | 0
Scaling Speculative Decoding with Lookahead Reasoning | Code | 0
MegaFold: System-Level Optimizations for Accelerating Protein Structure Prediction Models | Code | 2
Virtual Memory for 3D Gaussian Splatting | | 0
PocketVina Enables Scalable and Highly Accurate Physically Valid Docking through Multi-Pocket Conditioning | Code | 2
DIP: Unsupervised Dense In-Context Post-training of Visual Representations | Code | 1
Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models | Code | 3
Let Your Video Listen to Your Music! | | 0
Survey of HPC in US Research Institutions | | 0
4D-LRM: Large Space-Time Reconstruction Model From and To Any View at Any Time | | 0
CommVQ: Commutative Vector Quantization for KV Cache Compression | Code | 1
TDACloud: Point Cloud Recognition Using Topological Data Analysis | | 0
Lightweight RGB-T Tracking with Mobile Vision Transformers | | 0
Confucius3-Math: A Lightweight High-Performance Reasoning LLM for Chinese K-12 Mathematics Learning | Code | 2
ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation | Code | 3
Collaborative Texture Filtering | | 0
ConsumerBench: Benchmarking Generative AI Applications on End-User Devices | Code | 1
VeriLocc: End-to-End Cross-Architecture Register Allocation via LLM | | 0
Beyond Blur: A Fluid Perspective on Generative Diffusion Models | | 0
Speeding up Local Optimization in Vehicle Routing with Tensor-based GPU Acceleration | | 0
TrainVerify: Equivalence-Based Verification for Distributed LLM Training | | 0
